ASKSAGE: Sage Q&A Forum - Latest question feed
http://ask.sagemath.org/questions/
Q&A Forum for Sage (en)
Copyright Sage, 2010. Some rights reserved under creative commons license.
Thu, 12 Sep 2019 04:20:58 -0500

----------

**Memory leak with matroid is_connected method**
http://ask.sagemath.org/question/47839/memory-leak-with-matroid-is_connected-method/

Hello.
I've been using graphic matroids to deal with some enumeration problems, where I filter a list of graphs based on their flats. Even without fully understanding what happens at a low level, I could verify that the computation is faster when I build an equivalent representation of each graphic matroid using the incidence matrix of the graph over GF(2).

One big issue I am facing is that some memory leak is occurring. As the size of my list of graphs grows, RAM gets consumed until my computer breaks down and restarts. My workaround has been to use a bash script that keeps calling the main program until it reaches the end of an external text file (the input).

Forcing `gc.collect()` does not seem to be an option, because it makes the program really slow.
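A middle ground between collecting on every iteration and never collecting is to collect every N items, which bounds the peak garbage while amortizing the cost of a full pass. A minimal pure-Python sketch of that pattern (the helper `process_all` and the interval `gc_every` are made-up names for illustration, not anything from Sage):

```python
import gc

# Hypothetical helper: run `work` over `items`, but only trigger a full
# garbage collection once every `gc_every` iterations instead of every time.
def process_all(items, work, gc_every=1000):
    results = []
    for i, item in enumerate(items):
        results.append(work(item))
        if (i + 1) % gc_every == 0:
            gc.collect()  # one full collection per gc_every items
    return results

print(process_all(range(5), lambda x: x * x, gc_every=2))  # [0, 1, 4, 9, 16]
```

Tuning `gc_every` trades peak memory against throughput; whether this helps at all depends on whether the leaked objects are actually collectable cycles.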
EDIT 1:
The memory leak seems to be related to the `is_connected` method. Here is a piece of my code showing two different situations:
With the use of `is_connected`:
    sage: G = [g for g in graphs.nauty_geng("12 16:16 -Ct")]
    sage: len(G)
    11771
    sage: print(get_memory_usage())
    ....: for g in G:
    ....:     A = g.incidence_matrix()
    ....:     M = Matroid(A, ring = GF(2))
    ....:     if M.is_connected():
    ....:         pass
    ....: print(get_memory_usage())
    3396.38671875
    3418.06640625
and without:
    sage: print(get_memory_usage())
    ....: for g in G:
    ....:     A = g.incidence_matrix()
    ....:     M = Matroid(A, ring = GF(2))
    ....:     for r in range(M.corank()):
    ....:         for F in M.coflats(r):
    ....:             pass
    ....: print(get_memory_usage())
    3490.53515625
    3490.53515625
EDIT 2:
OK, I solved my problem by explicitly declaring the function `is_connected` in the body of my code, instead of using the method from the abstract matroid class. I'm not sure why this worked; I tried this approach because I thought it had something to do with the lists created by the `components` method, but I stopped investigating when I noticed the memory leak vanished just by using the standalone function. To be honest, I also changed the script so it does not return any output (there was an option to pass a certificate flag and return the components).

I'm leaving the code below in case someone faces a similar problem, or has a better solution or a clue about why this happened:
    cpdef components(self):
        B = self.basis()
        components = [frozenset([e]) for e in self.groundset()]
        for e in self.groundset() - B:
            C = self.circuit(B | set([e]))
            components2 = []
            for comp in components:
                if len(C & comp) != 0:
                    C = C | comp
                else:
                    components2.append(comp)
            components2.append(C)
            components = components2
        return components

    cpdef is_connected(self):
        comp = components(self)
        if len(comp) == 1:
            return True
        else:
            return False

*fmorlin, Thu, 12 Sep 2019 04:20:58 -0500*
http://ask.sagemath.org/question/47839/

----------

**Memory leak in Polyhedron?**
http://ask.sagemath.org/question/47243/memory-leak-in-polyhedron/

If I run the following code:
    import gc
    while True:
        P = Polyhedron([[1,0,0,0], [0,1,0,0], [0,0,1,0], [0,0,0,1]])
        faces = P.faces(3)
        del P, faces
        print "memory usage: " + str(get_memory_usage()) + ", gc: " + str(gc.collect())
the memory usage keeps increasing.
Do you know what is the cause and how to avoid this problem?

*Giove, Mon, 22 Jul 2019 09:15:08 -0500*
http://ask.sagemath.org/question/47243/

----------

**NEW VERSION: Sparse matrix stores something for each row?**
http://ask.sagemath.org/question/44709/new-version-sparse-matrix-stores-something-for-each-row/

**NEW VERSION. This is what this question has boiled down to upon further investigation. Everything below this paragraph in boldface and the subsequent code is the old version and serves purely historical purposes =) I hope this was the right thing to do rather than creating a new question. The issue is that the amount of memory occupied by a sparse matrix seems to depend linearly on the number of rows in the matrix. In particular, my setup won't even let me create a zero $10^9\times 1$ sparse matrix but has no problem with a zero $1\times 10^9$ sparse matrix. This can all be illustrated by running the code below (and it's hardly the way things are supposed to be).**
    print get_memory_usage()
    B=matrix(QQ, 1, 1000000000, {})
    print get_memory_usage()
    B=matrix(QQ, 10000000, 1, {})
    print get_memory_usage()
    B=matrix(QQ, 1000000000, 1, {})
    print get_memory_usage()
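For what it's worth, a footprint linear in the row count is exactly what a CSR-style layout produces: the row-pointer array has one entry per row boundary regardless of how many nonzeros there are. A pure-Python sketch of that accounting (the function `csr_overhead` is hypothetical bookkeeping for illustration, not Sage's actual implementation):

```python
# Hypothetical accounting for a CSR-style sparse layout: the `indptr`
# array has nrows + 1 entries no matter how sparse the matrix is, while
# values and column indices scale only with the number of nonzeros.
def csr_overhead(nrows, ncols, entries):
    """Count of stored numbers a CSR layout would need for this matrix."""
    indptr_len = nrows + 1       # one row-boundary pointer per row, plus one
    nnz = len(entries)           # only the nonzero entries themselves
    return indptr_len + 2 * nnz  # indptr + (value, column index) per nonzero

# A 1 x 10^9 zero matrix needs a 2-entry row-pointer array...
print(csr_overhead(1, 10**9, {}))   # 2
# ...but a 10^9 x 1 zero matrix needs a billion-and-one entries.
print(csr_overhead(10**9, 1, {}))   # 1000000001
```

This is the same asymmetry the code above exhibits: the wide zero matrix is essentially free, while the tall one pays for every row.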
----------
I'm running the following code. Here I create a sparse $10^5\times 10^5$ identity matrix and then apply it repeatedly to a vector in $\mathbb R^{10^5}$ with 100 nonzero coordinates (which is stored as a $10^5\times 1$ sparse matrix). Each time I add the result to a list, until I run out of 2 GB of memory.
    A=matrix(QQ, 100000, 100000, {(i,i):1. for i in range(100000)})
    print get_memory_usage()
    B=[matrix(QQ, 100000, 1, {(i,0):1. for i in range(100)})]
    while (get_memory_usage()<2000): B.append(A*B[-1])
    print len(B)
    print get_memory_usage()
    del B
    del A
    print get_memory_usage()
I'm receiving (on a freshly started kernel)

    204.54296875
    196
    2003.51953125
    1399.0234375
This raises two questions.
1. Why is there so much memory (1.4 GB) still in use after I successfully ran the code and deleted both objects I've created? That's a leak, right?
2. Why does deleting a list of 196 sparse matrices with 100 nonzero elements each free up 600 MB? Each such matrix should only take up a few KB, right?
I'm on Windows 8.1/SAGE 8.4.
**UPDATE**. Transposing the matrices, i.e. writing

    ...
    B=[matrix(QQ, 1, 100000, {(0,i):1. for i in range(100)})]
    for i in range(200): B.append(B[-1]*A)
    ...

seems to work well memory-wise; it returns

    214.50390625
    201
    215.53125
    186.62109375
However, it takes much more time than the first version. This is probably due to the implementation of sparse matrix multiplication unfortunately containing a [loop over the columns of the right matrix](https://github.com/sagemath/sage/blob/6187d261eca3c980e575b53d1a31f580ba8cfdfd/src/sage/matrix/matrix_rational_sparse.pyx#L184). Is there any simple way around this high memory/low speed dilemma?
**UPDATE 2.** This might not be a memory leak; it may have more to do with the implementation of sparse matrices in general than with their multiplication in particular. Apparently, a sparse matrix stores something for *each* of its rows, as shown by
    print get_memory_usage()
    B=matrix(QQ, 1, 10000000, {})
    print get_memory_usage()
    B=matrix(QQ, 10000000, 1, {})
    print get_memory_usage()

which prints

    165.45703125
    165.51953125
    1082.68359375
This has got to be a known issue, but I was not able to find a discussion. (This might be what is known as `csr_matrix` in `scipy`, but why this would be chosen as the general standard here is beyond me.)

*imakhlin, Tue, 18 Dec 2018 18:20:27 -0600*
http://ask.sagemath.org/question/44709/

----------

**Memory leak when doing ANF of boolean functions?**
http://ask.sagemath.org/question/35623/memory-leak-when-doing-anf-of-boolean-functions/

I am using SageMath version 7.2. (EDIT: I confirm the same behavior under Sage version 7.4.)
    sage: B = random_boolean_function(3)
    sage: get_memory_usage()
    732.5078125
    sage: B.algebraic_normal_form()
    x0*x1*x2 + x0*x1 + x0 + x2
    sage: get_memory_usage()
    743.53125
    sage: B.algebraic_normal_form()
    x0*x1*x2 + x0*x1 + x0 + x2
    sage: get_memory_usage()
    749.04296875
    sage: B.algebraic_normal_form()
    x0*x1*x2 + x0*x1 + x0 + x2
    sage: get_memory_usage()
    754.6875
    sage: B.algebraic_normal_form()
    x0*x1*x2 + x0*x1 + x0 + x2
    sage: get_memory_usage()
    760.19921875
After each ANF call the memory used is **rising**. Is this a memory leak?

*sageuser1, Thu, 17 Nov 2016 02:50:56 -0600*
http://ask.sagemath.org/question/35623/

----------

**Using Sage's save and load methods leads to high memory usage**
http://ask.sagemath.org/question/28850/using-sages-save-and-load-methods-leads-to-high-memory-usage/

I am using Sage to store and load big matrices, and I am monitoring the program's memory usage with htop in a separate terminal window. (Is there a better way to see how Sage is using memory?)
Apparently, the software uses too much memory when it saves and loads objects. Does anyone know what is going on and how to work around it?
----------
Next is some example code to show it; the comments record the memory used by the system at each step:
    # 211 Mb
    A = Matrix.random(RealField(200)['u','v','t'], 200)
    # 291 Mb (+ 80 Mb)
    save(A, 'A')
    # 458 Mb (+ 168 Mb)
    # A.sobj has size 8.55 Mb on disk
Closing and starting a new session of sage...
    # 211 Mb
    A = load('A')
    # 360 Mb (+ 149 Mb)
So, if the matrix A occupies 80 Mb, why is almost double that used when saving it? And I don't know how to recover that memory without closing the Sage session. Also, why is almost double the memory used again when loading A?
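For reference, the plain-pickle round trip with an explicit file handle (the approach suggested in the stackoverflow thread linked below) looks like this in pure Python. The helpers `save_obj`/`load_obj` are illustrative names, and whether this actually avoids the extra copies Sage's `save()` makes is exactly what is in question here:

```python
import os
import pickle
import tempfile

# Illustrative helpers (not Sage API): pickle through an explicit file handle.
def save_obj(obj, path):
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_obj(path):
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "A.pkl")
save_obj([[1, 2], [3, 4]], path)   # stand-in for the big matrix A
print(load_obj(path))              # [[1, 2], [3, 4]]
```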
----------
I've found a related question on stackoverflow:
http://stackoverflow.com/questions/20294628/using-pythons-pickle-in-sage-results-in-high-memory-usage
But using the open/write method suggested there leads to the same behavior here...

*Fabricio, Sun, 16 Aug 2015 16:38:42 -0500*
http://ask.sagemath.org/question/28850/

----------

**Computation gets killed every couple of hours**
http://ask.sagemath.org/question/23350/computation-gets-killed-every-couple-of-hours/

I'm running the following computation (in parallel for different ranges of curves), but Sage keeps killing my computation every couple of hours. For the first few curves it does fine, and then, when a computation takes something in the range of >3000s (~1 hr), it suddenly gets killed without any warning. This has led to a lot of frustration because there is no way for me to tell what is actually going wrong.

We initially thought it was a memory leak issue, so we added `gc.enable()`, but this only helped for the simpler computations. It didn't seem to make any difference for the computations that had already been failing before `gc.enable()` was added.

I can understand that the computations may take a really long time, since the computation for the L-series is exponential (?), but that doesn't seem like a reason for the computation to get killed. Also, when I restart at the last prime p at which the computation was killed, sometimes it works and the computation carries on (but gets killed later), and sometimes it doesn't work at all.
Could anybody potentially help to remedy this or point me in the direction of how to make the code better? Thanks.
    curves_list = ['30502b1', '30503b1', '30518c1', '30518d1', '30519f1', '30520a1', '30525w1', '30525x1', '30525bb1', '30530c1', '30530f1', '30534a1', '30535a1', '30535c1', '30537i1', '30537l1', '30544f1', '30544k1', '30550b1', '30550e1', '30550n1', '30550q1', '30550v1', '30550y1', '30558a1', '30558d1', '30564b1', '30564c1', '30564l1', '30565b1', '30566b1', '30573d1', '30575c1', '30576b1', '30576u1', '30576bf1', '30576bz1', '30576cb1', '30576cr1', '30576cs1', '30585c1', '30589b1', '30589d1', '30594a1', '30594b1', '30594d1']

    ##Step 3: find the last curve you computed data for
    import gc
    gc.enable()
    i = curves_list.index('30519f1')  #this is the curve for which it stopped
    ##Step 4: put in the [i:] in the next line
    for x in curves_list[i:]:  ##put in [i:]
        a = gc.collect()
        E = EllipticCurve(x)
        print E.cremona_label()
        sys.stdout.flush()
        ##Step 5: now add this as a case to pick up where you left off
        if x == '30519f1':
            ##add in the starting prime
            for p in prime_range(224,1000):
                if E.is_good(p) and E.is_ordinary(p):
                    t1 = cputime()
                    output = E.sha().p_primary_bound(p)
                    print 'memory usage: ', get_memory_usage()
                    a=gc.collect()
                    t2 = cputime()
                    a=gc.collect()
                    print 'bound at p=%s is %s'%(p,output)
                    sys.stdout.flush()
                    print 'memory usage: ', get_memory_usage()
                    sys.stdout.flush()
                    a=gc.collect()
                    if output > 0:
                        print 'BOUND IS > 0 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
                        sys.stdout.flush()
                    print 'time to compute: ', t2 - t1
                    sys.stdout.flush()
                    a=gc.collect()
        ##Step 6: add in an else and the code below
        else:
            time_count_E_start = cputime()
            for p in prime_range(5,1000):
                if E.is_good(p) and E.is_ordinary(p):
                    t1 = cputime()
                    output = E.sha().p_primary_bound(p)
                    print 'memory usage: ', get_memory_usage()
                    a=gc.collect()
                    t2 = cputime()
                    a=gc.collect()
                    print 'bound at p=%s is %s'%(p,output)
                    sys.stdout.flush()
                    print 'memory usage: ', get_memory_usage()
                    sys.stdout.flush()
                    a=gc.collect()
                    if output > 0:
                        print 'BOUND IS > 0 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
                        sys.stdout.flush()
                    print 'time to compute: ', t2 - t1
                    sys.stdout.flush()
                    a=gc.collect()
            a=gc.collect()
            time_count_E_end = cputime()
            print 'total time to compute for %s: %s'%(E.cremona_label(), time_count_E_end - time_count_E_start)
        print 30*'-'
        a=gc.collect()
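One workaround sometimes used for long-running leaky loops like the one above is to isolate each curve's computation in a fresh process, so whatever memory the computation leaks is reclaimed by the OS when the process exits. A minimal pure-Python sketch of the pattern (the per-label command is a placeholder that just measures the label's length; under Sage it would invoke something like `sage -c` with the real `p_primary_bound` call, and `run_isolated` is a made-up helper):

```python
import subprocess
import sys

def run_isolated(labels):
    """Run a (placeholder) per-curve computation in a fresh process per label.

    The process boundary is the point: a leak inside one computation
    cannot accumulate across curves, because the worker exits each time.
    """
    results = []
    for label in labels:
        out = subprocess.run(
            [sys.executable, "-c", "import sys; print(len(sys.argv[1]))", label],
            capture_output=True, text=True, check=True,
        )
        results.append(int(out.stdout))
    return results

print(run_isolated(["30502b1", "30503b1"]))  # [7, 7]
```

The per-process startup cost is real, so this only pays off when each unit of work is long (as it is here, at ~3000s per curve).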
Also, if it helps, this is the error message that I keep getting:
    /usr/local/sage/sage-6.2.rc2/local/bin/sage-python: line 2:  6182 Killed
    sage -python "$@"

*yeohaikal, Thu, 10 Jul 2014 11:55:19 -0500*
http://ask.sagemath.org/question/23350/

----------

**Force Gap to forget graph automorphism groups**
http://ask.sagemath.org/question/10389/force-gap-to-forget-graph-automorphism-groups/

I'm interested in computing and storing a huge number of graphs. For each graph I compute, I temporarily need to compute its automorphism group -- but I don't need to, and don't want to, remember that automorphism group. It looks like Sage deals with graphs on its own but calls Gap to deal with automorphism groups.
The problem is that Gap, not Sage, exceeds its permitted memory. Why? Here is a (silly) example of what I'm talking about: running
    mylist = []
    count = 1
    while True:
        g = graphs.RandomGNP(6, .5)
        aut = g.automorphism_group()
        mylist.append(g)
        if aut.order() > 1:
            print "Graph number %s has nontrivial automorphisms!"%(count)
        count += 1
gives me after some hours
    RuntimeError: Gap produced error output
    Error, exceeded the permitted memory (`-o' command line option)
How can I force Sage/Gap to forget the automorphism groups attached to the graphs (but remember the graphs)? Ideally, Gap should only need to store one automorphism group at a time.

*MT, Fri, 26 Jul 2013 02:43:43 -0500*
http://ask.sagemath.org/question/10389/

----------

**Runaway memory usage in Sage 5.0?**
http://ask.sagemath.org/question/9180/runaway-memory-usage-in-sage-50/

Hi,
I am running Sage 5.0 on Windows 7 (as it is the latest version available for Windows), and my code is crashing after a couple of hours of computation. Downgrading to Sage 4.8 fixes the problem. I'm not sure exactly where the issue is, so I will try to say as much about what I'm doing as possible.
I am using the algorithm described in this paper:
http://www.springerlink.com/content/1f6xdjt3a7e7h0qn/
to build a database of the lattices of order $n$ up to isomorphism. I am up to $n=12$ so far, and my goal is to reach $n=15$. The program works by generating the lattices of order $n+1$ from the lattices of order $n$.
As such, I am using lots of Posets and LatticePosets. Sage should not have to store in memory more than a thousand or so posets on $\leq 15$ nodes at any point during the code's execution, and should not have to hold much else in memory beyond these posets. My code takes as input the lattices of order $n$ and writes the lattices of order $n+1$ to a file as it generates them. I am running Sage 5.0 in VirtualBox with 4 processors and 1500 MB RAM allocated.
My code uses the @parallel decorator on one function. With this, the overall memory usage of my system climbs rapidly from what it was before (X) to X+1500MB, and after a few hours one of the return values from the parallelized function will be 'NO DATA' (instead of what I expected, which is a short list of posets), which tells me something went wrong. If I remove the @parallel decorator and just call my function with single inputs instead of lists of inputs, the memory usage of my system rises rapidly to X+1500MB and after a few hours the entire Sage virtual machine just shuts down.
However, if I downgrade to Sage 4.8, dedicate 4 processors and only 1250MB RAM to Virtualbox, I can use the @parallel decorator and my code will run stably for hours and eventually complete, without my system ever going over X+1000MB memory usage.
Does anyone have any idea what's going on here? Is Sage 5.0 caching all of the lattices of order $n+1$ that I'm generating along the way and eventually running out of memory or something?

*Martin Malandro, Wed, 25 Jul 2012 07:52:05 -0500*
http://ask.sagemath.org/question/9180/

----------

**Number fields and memory**
http://ask.sagemath.org/question/9167/number-fields-and-memory/

Hi,
I'm using number fields intensively. More precisely, I have a matrix group whose coefficients belong to some number field, and I iterate through this group.

For degrees beyond a certain threshold (~ degree 8), Sage starts to fill more and more memory, and I don't understand why (there is no reason for it from an algorithmic point of view). Is there a way to improve memory usage? How would you explain such behavior?
Thanks,
Vincent

*vdelecroix, Fri, 20 Jul 2012 18:28:13 -0500*
http://ask.sagemath.org/question/9167/

----------

**Memory blowup 2**
http://ask.sagemath.org/question/8749/memory-blowup-2/

This code in Sage 4.8 (with CPLEX installed, if that matters) uses a lot of memory. After around 36 million graphs, it is using around 21 GB:
    for g in graphs.nauty_geng("11"):
        ind_set = g.independent_set()
*G-Sage, Mon, 27 Feb 2012 10:38:14 -0600*
http://ask.sagemath.org/question/8749/

----------

**memory leak when doing lots of ideal tests**
http://ask.sagemath.org/question/8573/memory-leak-when-doing-lots-of-ideal-tests/

I'm trying to find elliptic curves lying on a quartic surface; there are probably clever ways to do this, but at the moment I am finding lots of quadrics through an integer point on the surface, then iterating over pairs of them and checking
`x^4+y^4+z^4-67*t^4 in ideal(f1,f2)`
This leaks about 2 MB of memory per second, presumably because the Groebner bases for the ideals are being cached; is there a way to tell Sage not to cache them?

*fivemack, Mon, 16 Jan 2012 23:59:06 -0600*
http://ask.sagemath.org/question/8573/
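Caching of expensive intermediate results, as suspected in the last question above, can look exactly like a leak: each new ideal's Groebner basis stays reachable from the cache. A pure-Python sketch of that pattern, with `functools.lru_cache` standing in for whatever caching Sage does internally (an assumption for illustration, not a statement about Sage's implementation):

```python
from functools import lru_cache

# Stand-in for an expensive per-ideal computation (e.g. a Groebner basis);
# the cache keeps one entry alive per distinct ideal ever tested.
@lru_cache(maxsize=None)
def expensive_basis(ideal_key):
    return tuple(sorted(ideal_key))

def contains(ideal_key, elem):
    # membership test that silently populates the cache
    return elem in expensive_basis(ideal_key)

print(contains(("f1", "f2"), "f1"))            # True
print(expensive_basis.cache_info().currsize)   # 1
expensive_basis.cache_clear()                  # explicitly drop cached bases
print(expensive_basis.cache_info().currsize)   # 0
```

If Sage exposes no equivalent of `cache_clear` for the objects in question, recreating the parent ring or ideal objects periodically is sometimes suggested as a blunt way to let the cached data be collected.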