Performance issues with parallel decoration

asked 2015-03-02 19:04:53 -0600

ikol

updated 2015-03-02 20:45:39 -0600

calc314

Experimenting with @parallel resulted in unexpected performance issues in Sage 6.4.1. Here is a very simple example:

@parallel(p_iter='multiprocessing', ncpus=6)
def f(n):
    return factor(n)
t=walltime()
r = range(1,1000000)
p = sorted(list( f(r)))
print walltime(t)
82.0724880695

t=walltime()
for i in range(1,1000000):
    factor(i)
print walltime(t)
12.1648099422

I have 6 physical cores, yet the serial calculation runs more than 6 times faster than the parallel one, even though I can see 6 instances of python running on my computer. Maybe it is pilot error. I have the following questions:

1) Does Sage need to be compiled in a special way in order to take full advantage of @parallel?

2) In this case using 'fork' is even worse: the calculation never completes.

3) How does @parallel distribute the calculations? Since, in general, it takes significantly longer for factor() to process larger numbers, it seems that assigning the cases n=1,7,13,... to core_0, n=2,8,14,... to core_1, etc., makes sense. Shuffling the original serial list given to f(n) also seems plausible. However, dividing the whole serial range into 6 intervals and assigning them to the 6 cores, respectively, would be a bad choice: for most of the time only one or two python processes would be doing anything. Does anyone know what scheme is used in Sage?
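To make the two static schemes from question 3 concrete, here is a small pure-Python sketch. It is not Sage's scheduler and says nothing about what @parallel actually does; the helper names and the cost model (work proportional to n) are assumptions chosen only for illustration.

```python
def contiguous_blocks(items, k):
    """Split items into k consecutive slices of near-equal length."""
    n = len(items)
    bounds = [i * n // k for i in range(k + 1)]
    return [items[bounds[i]:bounds[i + 1]] for i in range(k)]


def round_robin(items, k):
    """Deal items out like cards: worker i gets items i, i+k, i+2k, ..."""
    return [items[i::k] for i in range(k)]


def cost(n):
    # Hypothetical cost model: work grows linearly with n.
    # Real factoring times are far less regular than this.
    return n


def worker_loads(parts):
    """Total modelled cost assigned to each worker."""
    return [sum(cost(n) for n in part) for part in parts]


numbers = list(range(1, 61))
for name, scheme in [("blocks", contiguous_blocks), ("round robin", round_robin)]:
    loads = worker_loads(scheme(numbers, 6))
    print(name, loads, "max/min = %.2f" % (max(loads) / min(loads)))
```

Under this toy model the contiguous split leaves the last worker with many times the load of the first, while the interleaved split keeps the loads close to each other.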

Thanks for any suggestions.


1 answer


answered 2015-03-04 04:11:33 -0600

First of all, you cheat a bit: in the first case you also build a list and sort it, which costs some time.

sage: %timeit l = range(1000000)
100 loops, best of 3: 14 ms per loop

If a single call to the function f takes very little time, then it seems that you gain nothing from parallelization! Strange... If instead you factor much larger numbers (for example, those between $2^{128}$ and $2^{128} + 1000$), then you will see a gain.
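A plausible explanation, stated here as an assumption and not checked against the Sage source, is that every parallel call pays a fixed cost for passing arguments and results between processes, and for tiny inputs that cost exceeds the work itself. The usual workaround is to batch many cheap calls into one message. The sketch below shows the idea in plain Python with the standard library's ProcessPoolExecutor; trial_factor and factor_all are hypothetical helpers standing in for Sage's factor() and @parallel.

```python
from concurrent.futures import ProcessPoolExecutor


def trial_factor(n):
    """Factor n >= 1 by trial division; returns a list of (prime, exponent)."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))
    return factors


def factor_all(numbers, workers=6, chunksize=10000):
    """Factor every number, sending work to the pool in large batches.

    chunksize groups many cheap calls into one inter-process message, so the
    fixed cost per message is paid once per batch, not once per number.
    Results come back in the same order as the inputs.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(trial_factor, numbers, chunksize=chunksize))
```

A call such as factor_all(range(1, 1000000)) would then send about a hundred batches instead of a million single tasks. On platforms that start worker processes by spawning a fresh interpreter, the call has to sit under an `if __name__ == "__main__":` guard.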

To have a look at the source code you can do

sage: parallel??

You will see that it uses the class Parallel. I did not know which module this class belongs to. One way to find out is

sage: import_statements('Parallel')
from sage.parallel.decorate import Parallel

Then you can have a look at the code again

sage: from sage.parallel.decorate import Parallel
sage: Parallel??

and then continue the introspection this way. You can also look directly at the source code, which in this case lives in $SAGE_ROOT/src/sage/parallel/*
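Outside the Sage prompt, roughly the same lookup can be done with Python's standard inspect module. The sketch below uses multiprocessing's Pool class as a stand-in so that it runs without Sage; where_defined is a hypothetical helper name, and in a Sage session one would pass Parallel instead.

```python
import inspect
from multiprocessing.pool import Pool  # stand-in; in Sage, import Parallel instead


def where_defined(obj):
    """Return (source file, first line number) for a class or function."""
    path = inspect.getsourcefile(obj)
    _, first_line = inspect.getsourcelines(obj)
    return path, first_line


path, line = where_defined(Pool)
print(path, line)
```

inspect.getsource(obj) returns the full source text, which is close to what the ?? suffix displays.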

Vincent


Comments

Thanks, Vincent. Of course you are right: this is not an exact apples-to-apples comparison, and your point about the calculation being too fast is valid. I too experimented with very large numbers, and parallel performance is indeed better. Nonetheless, I see 6 python jobs start out, but four of them finish very quickly; then only two, and later only one, keep running for quite a while, which means that load balancing is far from ideal. I'll check the source code to see how the list is passed to the function.

Thanks again,

Istvan

ikol (2015-03-04 14:15:56 -0600)
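The pattern described in this comment (most workers idle while one or two keep running) is what a fixed up-front split produces when task costs are uneven. The toy simulation below compares such a split with a shared queue from which each idle worker pulls the next task. It is only a model: the linear cost list is an assumption, and nothing here is claimed about how any Sage version schedules its work.

```python
import heapq


def static_makespan(costs, k):
    """Finish time when the task list is cut into k consecutive blocks up front."""
    n = len(costs)
    bounds = [i * n // k for i in range(k + 1)]
    return max(sum(costs[bounds[i]:bounds[i + 1]]) for i in range(k))


def dynamic_makespan(costs, k):
    """Finish time when each idle worker pulls the next task from a shared queue."""
    free_at = [0] * k                    # time at which each worker becomes idle
    heapq.heapify(free_at)
    for c in costs:
        start = heapq.heappop(free_at)   # earliest available worker takes the task
        heapq.heappush(free_at, start + c)
    return max(free_at)


costs = list(range(1, 61))               # hypothetical: task i costs i time units
print(static_makespan(costs, 6), dynamic_makespan(costs, 6))
```

With 6 workers the total work of 1830 units cannot finish before time 305; the queue-based schedule stays close to that bound, while the block split is held up by the worker that received the most expensive block.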

In version 7.2 I no longer see the load balancing issue.

Istvan

ikol (2016-06-17 21:19:21 -0600)

Stats

Asked: 2015-03-02 19:04:53 -0600

Seen: 179 times

Last updated: Mar 04 '15