Hello.
I've been using Graphic Matroids to deal with some enumeration problems, where I filter a list of graphs based on their flats. Even without really undestanding what happens at low level, I could verify that the computation is faster when I build the equivalent representation of each graphic matroid using the Incidence matrix of the graph over GF(2).
However, one One big issue that I am facing is that some memory leak is ocurring while using this type of construction (a BinaryMatroid
). ocurring. As the size of my list of graphs grows, the ram memory got consumed until a point that my computer breaks down and restarts. I know the problem is related with the construction because I've tested the same code using a GraphicMatroid
and the get_memory_usage()
shows no variation. Unfortunately, it becomes too slow for my purposes.
My way out has been to use a bash script to keep calling the main program until I reach the end of an external text file (the input). The main part of the code where I identified the problem is simply:
What do you suggest?
To force the gc.collect()
seems to not be an option, because the program becomes really slow.
EDIT:
I think I was wrong about the problem's root. First of all, it is needed to be said that I am working with lists of thousands to millions of graphs and at first I thought it would be the cause (an unavoidable residue of memory, etc) .
However, the EDIT 1:
The memory leak seems to be related with the is_connected()
method. Here is a piece of my code showing two different situations:
With the use of is_connected()
:
sage: G = [g for g in graphs.nauty_geng("12 16:16 -Ct")]
sage: len(G)
11771
sage: print(get_memory_usage())
....:
....: for g in G:
....: A = g.incidence_matrix()
....: M = Matroid(A, ring = GF(2))
....: if M.is_connected():
....: pass
....: print(get_memory_usage())
....:
3396.38671875
3418.06640625
and without:
sage: print(get_memory_usage())
....: for g in G:
....: A = g.incidence_matrix()
....: M = Matroid(A, ring = GF(2))
....: for r in range(M.corank()):
....: for F in M.coflats(r):
....: pass
....:
....: print(get_memory_usage())
....:
3490.53515625
3490.53515625
Now I am EDIT 2:
Ok, I solved my problem by just explicitly declaring the function is_connected()
in the body of my code, instead of using the method present in the abstract matroid class. I'm not sure about why it really worked, I tried this approach because I thought it had something to do with the lists that are created by the components()
method, however I stopped investigating when I notice the memory leak vanished just by using the function. To be honest, I changed the script to not return any output (there was an option to input a certificate flag and return the components). Unfortunately, my program became a bit slower (20% average).
I'm letting the code below for the case someone faces a similar problem or has any better solution or clue about why it happened:
cpdef components(self):
B = self.basis()
components = [frozenset([e]) for e in self.groundset()]
for e in self.groundset() - B:
C = self.circuit(B | set([e]))
components2 = []
for comp in components:
if the issue is really associated with the binary matroid representation. Maybe it is some misconception of mine. Even though, I would be glad len(C & comp) != 0:
C = C | comp
else:
components2.append(comp)
components2.append(C)
components = components2
return components
cpdef is_connected(self):
if you could help find it out.len(comp) == 1:
return True
else:
return False