
Martin Malandro's profile - activity

2023-11-02 13:32:07 +0200 received badge  Famous Question (source)
2020-04-17 03:39:01 +0200 received badge  Notable Question (source)
2016-06-24 15:53:54 +0200 received badge  Notable Question (source)
2016-06-22 06:05:57 +0200 received badge  Student (source)
2016-06-18 06:01:59 +0200 received badge  Popular Question (source)
2016-01-02 17:33:22 +0200 received badge  Famous Question (source)
2015-03-06 03:32:32 +0200 received badge  Notable Question (source)
2014-12-06 09:21:57 +0200 received badge  Popular Question (source)
2014-04-29 09:18:41 +0200 received badge  Popular Question (source)
2013-03-06 22:15:31 +0200 asked a question calling a parallel decorated function on an iterator

Say I've decorated a function f with @parallel and then, instead of evaluating f at a list of inputs, I evaluate f at a generator whose yields are meant to be the list of inputs at which I want to evaluate f. Silly example:

def my_iterator(n):
    for i in xrange(n):
        yield i

@parallel(verbose=True)
def f(x):
    return x^2

for x in f(my_iterator(2^5)):
    print x

This works: Sage interprets the input "my_iterator(2^5)" to f correctly. However, replacing 2^5 with 2^30 shows that Sage computes this by first building a list of all the yields of my_iterator(2^30) and only then distributing the elements of that list to forked subprocesses. That is,

for x in f(my_iterator(2^30)):
    print x

is functionally identical to

for x in f([x for x in my_iterator(2^30)]):
    print x

which is horrible. Instead of starting to yield outputs immediately while using virtually no memory, Sage consumes all available memory (as it tries to build a list of length 2^30) and then the computation just dies. Even worse, it dies silently, producing no output at all despite the "verbose=True" option.

When setting up lengthy parallelized computations, it sometimes makes sense to create the inputs to a parallelized function using a generator (either algorithmically or by, say, reading an input file). It would be nice if we could start sending those inputs out via the @parallel mechanism for processing as soon as they're generated, instead of creating and storing in memory every input that will be evaluated before the first forked subprocess begins. What I want is for the interior generator ("my_iterator(2^30)" in the above example) to yield its next result to a forked subprocess whenever a core is available, with the parallel computation running until the interior generator is out of things to yield and all subprocesses have returned.

So, my question: is there a simple workaround, modification of the source code, or alternative method for achieving this?
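The closest workaround I can think of is to pull inputs from the generator in fixed-size chunks, so that only one chunk is ever materialized at a time. This is only a sketch (parallel_chunks and the chunk size of 1000 are names and values I've made up), and it isn't a full solution: each chunk boundary is a synchronization point, so idle cores wait for the slowest job in the chunk instead of immediately receiving the next input.

from itertools import islice

def parallel_chunks(func, gen, chunk_size=1000):
    # Materialize at most chunk_size inputs at a time; each chunk is
    # dispatched through the @parallel fork mechanism before the next
    # chunk is read from the generator.
    while True:
        chunk = list(islice(gen, chunk_size))
        if not chunk:
            return
        for result in func(chunk):  # yields ((args, kwds), output) pairs
            yield result

# using f and my_iterator from above
for x in parallel_chunks(f, my_iterator(2^30)):
    print x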

2013-03-06 21:02:54 +0200 marked best answer Can't thread with permutation groups?

The problem is that the GAP interface is not thread-safe: both threads are accessing the same GAP session. You could instead use the @parallel decorator and forking (which has a bit of overhead). For example,

def method_1(n):
    import time
    time.sleep(20)  # simulate the slow method
    return 1, Integer(gap(n)*n)

def method_2(n):
    return 2, Integer(gap(n)*n)

def try_fast(n):
    @parallel
    def try_method(method):
        return method(n)

    # take the first result to come back; remaining workers are killed
    method_num, result = try_method([method_1, method_2]).next()[1]
    print "Computed by method %s" % method_num
    return result

The result of the (decorated) try_method function is an iterator which will yield the results as soon as they are ready. In this case, we just need the first one.

sage: %time try_fast(4)
Killing any remaining workers...
Computed by method 2
CPU times: user 0.00 s, sys: 0.02 s, total: 0.02 s
Wall time: 0.27 s
16
2013-03-06 21:02:54 +0200 received badge  Scholar (source)
2012-08-28 23:03:08 +0200 commented answer Can't thread with permutation groups?

This works, but when I run it in a loop I get the "Killing any remaining workers" message each time try_fast runs. If I go this route, can I expect Sage to kill the slower workers reliably in general?

2012-08-24 19:12:10 +0200 asked a question Can't thread with permutation groups?

I'm having trouble running threads involving permutation groups. Here is a small example that shows the issue on Sage 5.2 (and also Sage 4.8).

import time
from threading import Thread

class f(Thread):
    def __init__(self, val):
        Thread.__init__(self)
        self.val = val

    def run(self):
        G = CyclicPermutationGroup(self.val)
        print 'here'
        print G

a = f(4)
b = f(4)
a.start()
b.start()
a.join()
b.join()

The "here"s print but then the worksheet (usually) just keeps running without doing anything and refuses to be interrupted, or (sometimes) crashes and offers error messages such as

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/tmp/tmp4JA7Av/___code___.py", line 14, in run
    print G
  File "sage_object.pyx", line 154, in sage.structure.sage_object.SageObject.__repr__ (sage/structure/sage_object.c:1753)
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup_named.py", line 436, in _repr_
    return "Cyclic group of order %s as a permutation group"%self.order()
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup.py", line 1400, in order
    if not self.gens() or self.gens() == [self(1)]:
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup.py", line 646, in __call__
    return self.identity()
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python2.7/site-packages/sage/groups/perm_gps/permgroup.py", line 902, in identity
    return self._element_class()([], self, check=True)
  File "permgroup_element.pyx", line 452, in sage.groups.perm_gps.permgroup_element.PermutationGroupElement.__init__ (sage/groups/perm_gps permgroup_element.c:4337)
  File "sage_object.pyx", line 463, in sage.structure.sage_object.SageObject._gap_ (sage/structure/sage_object.c:4518)
  File "sage_object.pyx", line 439, in sage.structure.sage_object.SageObject._interface_ (sage/structure/sage_object.c:4118)
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python2.7/site-packages/sage/interfaces/interface.py", line 198, in __call__
    return cls(self, x, name=name)
  File "/sagenb/sage_install/sage-5.2-sage.math.washington.edu-x86_64-Linux/local/lib/python2.7/site-packages/sage/interfaces/expect.py", line 1328, in __init__
    raise TypeError, x
TypeError: Gap produced error output
Syntax error: ; expected
$sage2:=Group([PermList([2, 3, 4, 1])]);;
 ^

   executing $sage2:=Group([PermList([2, 3, 4, 1])]);;

I've gotten other error messages as well, but nothing I can reliably reproduce.

If you delete the start and join lines for a or b in the code, it runs fine, of course.

Any ideas here? In case you're curious, the reason I'm trying to thread is this: I have a bunch of problems that can be solved by either of two different methods. For any given problem, one method is usually much faster than the other, but it's difficult to predict in advance which will be faster, so for a given problem I'd like ... (more)

2012-07-25 17:13:14 +0200 commented question Runaway memory usage in Sage 5.0?

I'm not sure I could post a relevant portion because I'm not sure where the issue is. The code itself is kind of long. I would be happy to send the code (and the relevant input files) to anyone willing to take a look at it.

2012-07-25 14:52:05 +0200 asked a question Runaway memory usage in Sage 5.0?

Hi,

I am running Sage 5.0 on Windows 7 (5.0 being the latest version available for Windows) and my code is crashing after a couple of hours of computation. Downgrading to Sage 4.8 fixes the problem. I'm not sure exactly where the issue is, so I will try to say as much as possible about what I'm doing.

I am using the algorithm described in this paper:

http://www.springerlink.com/content/1...

to build a database of the lattices of order $n$ up to isomorphism. I am up to $n=12$ so far, and my goal is to reach $n=15$. The program works by generating the lattices of order $n+1$ from the lattices of order $n$.

As such, I am using lots of Posets and LatticePosets. At no point during the code's execution should Sage have to store more than a thousand or so Posets on $\leq 15$ nodes, and it should not have to hold much else in memory beyond these posets. My code takes as input the lattices of order $n$ and writes the lattices of order $n+1$ to a file as it generates them. I am running Sage 5.0 in VirtualBox with 4 processors and 1500MB RAM allocated.

My code uses the @parallel decorator on one function. With this, the overall memory usage of my system climbs rapidly from its baseline (X) to X+1500MB, and after a few hours one of the return values from the parallelized function is 'NO DATA' (instead of the short list of posets I expected), which tells me something went wrong. If I remove the @parallel decorator and call my function on single inputs instead of lists of inputs, the memory usage of my system still rises rapidly to X+1500MB, and after a few hours the entire Sage virtual machine just shuts down.

However, if I downgrade to Sage 4.8 and dedicate 4 processors and only 1250MB RAM to VirtualBox, I can use the @parallel decorator and my code will run stably for hours and eventually complete, without my system ever going over X+1000MB memory usage.

Does anyone have any idea what's going on here? Is Sage 5.0 caching all of the lattices of order $n+1$ that I'm generating along the way and eventually running out of memory or something?
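For concreteness, the shape of my code is roughly the following sketch (schematic only: generate_successors stands in for the extension algorithm from the paper, and the output format is invented). The memory climbs even though each batch of results is written to the file and discarded immediately:

import gc

@parallel(ncpus=4)
def extend_lattice(L):
    # stand-in for the paper's algorithm: return the lattices of
    # order n+1 generated from the order-n lattice L
    return generate_successors(L)

def run(lattices_of_order_n, outfile_name):
    out = open(outfile_name, 'w')
    # the @parallel iterator yields ((args, kwds), result) pairs
    # as the forked workers finish
    for args, successors in extend_lattice(lattices_of_order_n):
        for M in successors:
            out.write(str(M.cover_relations()) + '\n')
        out.flush()
        gc.collect()  # nothing should be retained between batches
    out.close()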