Computation gets killed every couple of hours

I'm running the following computation (in parallel, for different ranges of curves), but Sage keeps killing it every couple of hours. The first few curves run fine, but once a single computation takes more than about 3000 s (~1 hr), the process is suddenly killed without any warning. This has led to a lot of frustration, because I have no way to tell what is actually going wrong.

We initially thought it was a memory leak, so we added gc.enable(), but this only helped for the simpler computations. It made no difference for the ones that had already been failing before gc.enable() was added.
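One thing worth checking about the gc approach: in CPython the cyclic garbage collector is normally enabled by default, so gc.enable() usually changes nothing, and gc.collect() can only reclaim objects that are no longer reachable. A minimal plain-Python sketch of that behaviour follows. The Node class and demo function are hypothetical names used only for illustration, and the sketch says nothing about where Sage itself holds memory.

```python
import gc


class Node(object):
    """Toy object that refers to itself, so only the cycle collector can free it."""

    def __init__(self):
        self.ref = self  # self-reference creates a reference cycle


def demo():
    enabled = gc.isenabled()             # state of the collector before we touch it
    gc.collect()                         # clear any garbage left over from earlier code
    cache = [Node() for _ in range(1000)]
    while_cached = gc.collect()          # objects are still reachable through `cache`
    del cache                            # drop the last reference to the list
    after_delete = gc.collect()          # now the 1000 cycles are unreachable
    return enabled, while_cached, after_delete


if __name__ == "__main__":
    print(demo())
```

If the memory is held by something that is still referenced (for example a cache), calling gc.collect() more often would not be expected to release it.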

I can understand that the computations may take a really long time, because computing the L-series is exponential (?), but that doesn't seem like a reason for the computation to get killed. Also, when I restart at the last prime p at which the computation was killed, sometimes it works and carries on, only to be killed later, and sometimes it doesn't work at all.

Could anybody potentially help to remedy this or point me in the direction of how to make the code better? Thanks.

curves_list= ['30502b1', '30503b1', '30518c1', '30518d1', '30519f1', '30520a1', '30525w1', '30525x1', '30525bb1', '30530c1', '30530f1', '30534a1', '30535a1', '30535c1', '30537i1', '30537l1', '30544f1', '30544k1', '30550b1', '30550e1', '30550n1', '30550q1', '30550v1', '30550y1', '30558a1', '30558d1', '30564b1', '30564c1', '30564l1', '30565b1', '30566b1', '30573d1', '30575c1', '30576b1', '30576u1', '30576bf1', '30576bz1', '30576cb1', '30576cr1', '30576cs1', '30585c1', '30589b1', '30589d1', '30594a1', '30594b1', '30594d1']

Step 3: find the last curve you computed data for

import gc
gc.enable()
i = curves_list.index('30519f1')  # this is the curve for which it stopped

Step 4: put in the [i:] in the next line

for x in curves_list[i:]:  ##put in [i:]
    a = gc.collect()
    E = EllipticCurve(x)
    print E.cremona_label()
    sys.stdout.flush()

Step 5: now add this as a case to pick up where you left off

if x == '30519f1':
    ##add in the starting prime
    for p in prime_range(224,1000):
        if E.is_good(p) and E.is_ordinary(p):
            t1 = cputime()
            output = E.sha().p_primary_bound(p)
            print 'memory usage: ', get_memory_usage()
            a=gc.collect()
            t2 = cputime()
            a=gc.collect()
            print 'bound at p=%s is %s'%(p,output)
            sys.stdout.flush()
            print 'memory usage: ', get_memory_usage()
            sys.stdout.flush()
            a=gc.collect()
            if output > 0:
                print 'BOUND IS > 0 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
                sys.stdout.flush()
            print 'time to compute: ', t2 - t1
            sys.stdout.flush()
            a=gc.collect()

Step 6: add in an else and the code below

else:
    time_count_E_start = cputime()
    for p in prime_range(5,1000):
        if E.is_good(p) and E.is_ordinary(p):
            t1 = cputime()
            output = E.sha().p_primary_bound(p)
            print 'memory usage: ', get_memory_usage()
            a=gc.collect()
            t2 = cputime()
            a=gc.collect()
            print 'bound at p=%s is %s'%(p,output)
            sys.stdout.flush()
            print 'memory usage: ', get_memory_usage()
            sys.stdout.flush()
            a=gc.collect()
            if output > 0:
                print 'BOUND IS > 0 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
                sys.stdout.flush()
            print 'time to compute: ', t2 - t1
            sys.stdout.flush()
            a=gc.collect()
    a=gc.collect()
    time_count_E_end = cputime()
    print 'total time to compute for %s: %s'%(E.cremona_label(), time_count_E_end - time_count_E_start)
print 30*'-'
a=gc.collect()
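The manual restart in Steps 3 to 6 could also be automated by recording each finished (curve, prime) pair in a file and skipping those pairs on the next run. A minimal sketch under stated assumptions: run_resumable, load_done, and the checkpoint file format are hypothetical, and compute stands in for a function that would call E.sha().p_primary_bound(p).

```python
import json
import os


def load_done(path):
    """Return the set of (label, p) pairs already recorded in the checkpoint file."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return set(tuple(json.loads(line)) for line in f if line.strip())


def run_resumable(labels, primes, compute, path):
    """Call compute(label, p) for every pair not yet listed in the checkpoint file."""
    done = load_done(path)
    results = []
    with open(path, "a") as f:
        for label in labels:
            for p in primes:
                if (label, p) in done:
                    continue  # finished in an earlier run, skip it
                value = compute(label, p)
                results.append((label, p, value))
                # Record the pair only after compute() returned, then flush so
                # the line is handed to the OS before the next long computation.
                f.write(json.dumps([label, p]) + "\n")
                f.flush()
    return results
```

With this shape, rerunning the same script after a kill resumes at the first unfinished pair, so the special case for the last curve and starting prime is no longer edited by hand.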

Also, if it helps, this is the error message that I keep on getting:

/usr/local/sage/sage-6.2.rc2/local/bin/sage-python: line 2: 6182 Killed sage -python "$@"
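The "Killed" in this message is what the shell prints when a process receives SIGKILL, which on Linux is often (though not always) the kernel's out-of-memory killer. One way to at least see which step dies is to run each unit of work in a child process and inspect its return code. A minimal sketch, assuming a POSIX system: run_isolated is a hypothetical helper, and the snippet passed to it stands in for the real Sage computation.

```python
import signal
import subprocess
import sys


def run_isolated(code):
    """Run a Python snippet in a fresh interpreter and report how it ended."""
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    out, _ = proc.communicate()
    # On POSIX, a negative return code -N means the child was ended by signal N.
    if proc.returncode == -signal.SIGKILL:
        status = "killed"
    elif proc.returncode == 0:
        status = "ok"
    else:
        status = "failed"
    return status, out


if __name__ == "__main__":
    print(run_isolated("print(2 + 3)"))
```

Because the child is a separate process, whatever memory it used is returned to the system when it exits, and the parent survives to log which curve and prime were being processed.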