# perfomance: GAP code vs SAGE code

I have a heavy procedure in GAP. I want to speed it up.

Is it a good idea to rewrite as much as I can on SAGE and use GAP only where It is needed?

For example rewrite parts that works with lists, with iterators and with files.

Or may be GAP realization of such object is faster (because of language C) than SAGE version.

P.S. Of course I hate GAP language and I want to write on SAGE instead. But performance of calculations is very important for me.

edit retag close merge delete

Sort by » oldest newest most voted

The GAP<->Sage interface is rather slow. LibGAP will make it much faster at one point, but is not finished yet. So as a general strategy, you should do as much as you can just in Sage or just in GAP.

GAP is very fast for group theoretic computations. But working with lists, iterators, and files will always be beaten by Sage/Cython simply because its compiled code. So, well, it depends on what exactly your problem is.

more

Thanks for answer. Please, tell me how to use cython? About pure python SAGE I got not good benchmarks (see my answer on this question).

( 2011-10-26 10:37:16 +0200 )edit
( 2011-10-27 13:31:10 +0200 )edit

Your code is mostly a div/mod benchmark if you run it in the profiler. For example, the following is ~200 times faster. Apart from the obvious use of machine ints, note that appending to a list is much faster than prepending:

%cython
cimport cython
cdef generate_subgroup_list_by_list_of_subgroups_and_number(subgroups, int number):
cdef int number_subgroups = len(subgroups)
subgroups_list = []
cdef int i, exp2, factor
for i in range(number_subgroups-1, -1, -1):
exp2 = 2<<i
factor = cython.cdiv(number, exp2)
number = cython.cmod(number, exp2)
if factor == 1:
# subgroups_list = [subgroups[i]] + subgroups_list
subgroups_list.append(subgroups[i])
return subgroups_list

def dostuff():
array = range(1,11)
for i in range(1,101):
generate_subgroup_list_by_list_of_subgroups_and_number(array, i)

more

In my humble experience GAP is rather fast, and I doubt migrating code to sage will speed it up (unless sage has better algorithms for the tasks).

Also take into account that calling to GAP from within sage is slow: http://ask.sagemath.org/question/477/...

more

Hi there!

Concerning "slowness of the interface": That ist a general problem for the pexpect-interfaces. The idea of these interfaces is that Sage pretends to be a user who is interactively working with a GAP/Singular/Magma/... session. So, Sage sends strings that GAP/Singular/Magma interpretes as commands; then Sage waits for the result; when they appear, the strings printed by GAP/Singular/Magma are transferred back to Sage and interpreted.

Of course, this has one big disadvantage: Communication with a sub-process by ascii strings and parsing of printed results has a huge overhead. Therefore, when using such pexpect interfaces, it is sometimes needed to do, for example, a loop in Singular, even though a loop in Sage (using Cython, at least) is much faster.

I give you some examples using the Singular pexpect interface:

sage: P.<x,y,z> = QQ[]
sage: sP = singular(P)
sage: M = singular.matrix(100,100,0)
sage: def test1(M):
....:     for i in range(1,101):
....:         for j in range(1,101):
....:             M[i,j] = singular(i)*singular(j)
....:
sage: def test2(M):
....:     for i in range(1,101):
....:         for j in range(1,101):
....:             M[i,j] = i*j
....:
sage: def test3(M):
....:     singular.eval('int i')
....:     singular.eval('int j')
....:     singular.eval('for (i=1;i<=100;i++){for (j=1;j<=100;j++){%s[i,j]=i*j;}}'%M.name())
....:


The three functions are equivalent:

sage: M = singular.matrix(100,100,0)
sage: test1(M)
sage: N = singular.matrix(100,100,0)
sage: test2(N)
sage: O = singular.matrix(100,100,0)
sage: test3(O)
sage: list(M)==list(N)==list(O)
True


However the running times are vastly different, since the first two functions use the interface very inefficiently:

sage: M = singular.matrix(100,100,0)
sage: %time test1(M)
CPU times: user 8.50 s, sys: 1.96 s, total: 10.46 s
Wall time: 10.57 s
sage: M = singular.matrix(100,100,0)
sage: %time test2(M)
CPU times: user 4.44 s, sys: 1.06 s, total: 5.50 s
Wall time: 5.65 s
sage: M = singular.matrix(100,100,0)
sage: %time test3(M)
CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
Wall time: 0.14 s


The overhead of a pexpect interface can be completely avoided by using a C library instead. In the case of Singular, such library is available: libSingular. But, as Volker said, LibGAP is not totally ready, yet.

After stating so many negative things about pexpect, you might wonder why it exists. In fact, the pexpect approach has at least two advantages:

• The same interface (after modifying certain details: "What is the prompt?", "How is equality tested?" and so on) can be used for interacting with any command line program. In particular, we can offer a pexpect interface to Magma and Maple, but it would be illegal to include them in Sage as C libraries
more

My previous example exposed a case in which it is better to do low-level stuff inside Singular/GAP/Magma/... sub-processes.

I guess I also owe you an example that shows how much faster Python and Cython are. Consider an almost empty loop:

sage: singular.eval('int i')
'int i;'
sage: singular.eval('int j')
'int j;'
sage: def test1():
....:     singular.eval('for (i=1;i<=100000;i++){j=i;}')
....:
sage: def test2():
....:     for i in range(1,100001):
....:         j = i
....:
sage: def test3():
....:     for i in xrange(1,100001):
....:         j = i
....:
sage: cython("""
....: def test4():
....:     cdef int i
....:     cdef int j
....:     for i from 1<=i<=100000:
....:         j = i
....: """)
sage: %timeit test1()
5 loops, best of 3: 445 ms per loop
sage: %timeit test2()
125 loops, best of 3: 4.74 ms per loop
sage: %timeit test3()
125 loops, best of 3: 3.27 ms per loop
sage: %timeit test4()
625 loops, best of 3: 126 ns per loop


You see, if such basic things are concerned, Python is much faster than Singular (I don't know how fast GAP is), and Cython much faster than Python.

more

You can use the normal range() syntax for cython loops:"for i in range(1,100001):" will produce the same code as "for i from 1<=i<=100000". This was'nt the case in older Cython releases, but nowadays there is no difference. Also, the C compiler will just optimize out the empty loop. So your cython example actually does'nt loop over anything, it only measures the overhead of calling a C function.

( 2011-10-27 18:21:11 +0200 )edit

Ouch, I sometimes forget how clever Cython is. I knew that Cython doesn't loop if one has for i in range(N): pass, but I thought that the assignement j=i would prevent it!

( 2011-10-28 10:24:59 +0200 )edit

For the test I wrote such code in GAP

GenerateSubgroupListByListOfSubgroupsAndNumber := function (subgroups, number)
local number_subgroups, subgroups_list, factor, i;

number_subgroups := Size(subgroups);
subgroups_list := [];

for i in Reversed([0..number_subgroups-1]) do
factor := Int(number / 2^i);
number := RemInt(number, 2^i);

if factor = 1 then
fi;
od;

return subgroups_list;
end;


And SAGE equivalent

 def generate_subgroup_list_by_list_of_subgroups_and_number(subgroups, number):
number_subgroups = len(subgroups);
subgroups_list = []

for i in reversed(range(0, number_subgroups)):
factor, number = int( number / 2^i ), number % 2^i

if factor == 1:
subgroups_list = [subgroups[i]] + subgroups_list

return subgroups_list


I think that code above is a good example of working with lists, numbers and etc.

And I ran following benchmarks:

GAP:

array := [1..10];
for n in [1..10000] do
for i in [1..100] do
GenerateSubgroupListByListOfSubgroupsAndNumber(array, i);
od;
od;


Result:

$time echo "Read(\"gap_benchmark.g\"); quit;" | gap -b -q -T real 0m7.651s user 0m7.520s sys 0m0.056s  SAGE: array = [1..10] for n in [1..10000]: for i in [1..100]: generate_subgroup_list_by_list_of_subgroups_and_number(array, i)  Result: $ time sage sage_benchmark.sage

real    0m50.127s
user    0m49.707s
sys 0m0.268s


So, 7 seconds of GAP work vs 50 second of SAGE.

I will try to use cython with SAGE and will update this answer.

more