ASKSAGE: Sage Q&A Forum - Individual question feed
http://ask.sagemath.org/questions/
Q&A Forum for Sage
Copyright Sage, 2010. Some rights reserved under Creative Commons license.
Mon, 15 Aug 2011 01:52:06 -0500

when/how to use parallel?
http://ask.sagemath.org/question/8272/whenhow-to-use-parallel/

So I've scouted around some of the questions posed here that involve the terms parallel or the decorator @parallel, and I want to know: is there any documentation out there that introduces novices to parallel computing (in Sage)?
At present (i.e. with the parallel? documentation being what it is), I don't really know how (or even when) to take advantage of @parallel.
If I have a dual-core CPU, should I be trying to decorate every function with @parallel? Can someone educate this fool?
Thanks,
Steven
EDIT:
---
I suppose the best answer here would give an example of situations where @parallel can be of use, and where it cannot. I'm going to throw two examples out there; someone let me know if this is correct. Feel free to add your own, if my examples are too superficial.
1. *Computing arc-length of a (smooth) curve parametrized by* $t \in [0,2\pi]$
        var('t')
        def calcArcLength(gamma):
            speed_gamma = sqrt(sum([component.diff(t)^2 for component in gamma]))
            return numerical_integral(speed_gamma, 0, 2*pi)[0]
2. *Calculating the arc-lengths of a list of smooth curves*
        @parallel
        def makeList(length=1):
            li = []
            for i in range(length):  # range(length-1) would produce one curve too few
                li.append(generateRandomCurve())
            return li

        @parallel
        def processList(list_of_curves):
            arclength_of_curve = []
            for curve in list_of_curves:
                arclength_of_curve.append(calcArcLength(curve))
            return arclength_of_curve
Is it fair to say that @parallel should speed up the time to make and process list_of_curves, whereas it won't speed up the function numerical_integral?
Sat, 13 Aug 2011 18:17:47 -0500

Answer by niles:
http://ask.sagemath.org/question/8272/whenhow-to-use-parallel/?answer=12577#post-id-12577

I think you might be misunderstanding what the parallel decorator does to a function (or at least not understanding it in the limited way I do!). As I understand it, the basic functionality of `@parallel` is to convert a function which takes a single input to one which takes a list of inputs, and runs the original function on each item in the list. The matter is slightly confused because the new "decorated" function has the same name as the original one, so from a user's point of view, it's hard to tell the difference between the two (and of course, that's the point of decorators :). The "parallel" functionality is really just doing each of those separate computations on separate processors.
`@parallel` doesn't really analyze or optimize the actual workings of your function. Thus, I don't think you need to write the extra "list processing" functions, but just decorate `calcArcLength` if you want to compute the arc lengths of a bunch of curves in parallel. And I think you are right that it won't speed up something like `numerical_integral` on just a single computation.
So, in answer to the question in the title, I would say use `@parallel` when you want to apply the same function to a long list of inputs -- write the function, and let `@parallel` distribute the separate function calls to separate processes.
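To make that pattern concrete outside of Sage: the following is a plain-Python sketch of what `@parallel` does at the call site, not Sage's actual implementation. The names `square` and `parallel_map` are my own illustrative inventions, and threads are used here for portability (Sage forks separate processes):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(n):
    # stand-in for an expensive per-input computation (e.g. one arc length)
    return n * n

def parallel_map(f, inputs):
    # rough analogue of calling a @parallel-decorated f on a list of inputs:
    # yields ((args, kwargs), result) pairs in completion order
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(f, x): x for x in inputs}
        for fut in as_completed(futures):
            yield (((futures[fut],), {}), fut.result())

# collect results keyed by input, since completion order is unpredictable
results = {args[0]: out for (args, _kw), out in parallel_map(square, [2, 3, 4])}
```

The point of the sketch is only the call shape: you write the single-input function, and the wrapper fans a list of inputs out to workers.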
---
I'm not sure if an example is necessary at this point, but I don't think there are enough `@parallel` examples, so I'll extend the one given by @benjaminfjones:
    @parallel
    def hard_computation(n, d=1, show_out=False):
        for j in range(n):
            sleep(.1)
        if show_out:
            print "factoring", n, "/", d, ".."
        return str(factor(floor(n/d)))
The function `hard_computation` just simulates some long calculation. The following is just its basic functionality, and would work with or without the `@parallel` decorator:
    sage: hard_computation(16, 2, True)
    factoring 16 / 2 ..
    '2^3'
    sage: %time hard_computation(12)
    '2^2 * 3'
    CPU time: 0.00 s, Wall time: 1.20 s
Note that, even though the `for` loop could be parallelized, `@parallel` doesn't speed this up. Here's what `@parallel` makes possible:
    sage: r = hard_computation([2*n for n in range(3,10)])  # this is instantaneous
    sage: r
    <generator object __call__ at 0x10d559e10>
So `@parallel` just sets up a generator object. None of the `hard_computation` code is run until you start getting items from `r`. Here's what I do:
    for x in r:
        print x
Which returns:
    (((6,), {}), '2 * 3')
    (((8,), {}), '2^3')
    (((10,), {}), '2 * 5')
    (((12,), {}), '2^2 * 3')
    (((14,), {}), '2 * 7')
    (((16,), {}), '2^4')
    (((18,), {}), '2 * 3^2')
In each line of output, `x[0]` holds the arguments for the computation, and `x[1]` holds the output. This is important because the order of the output is just the order that the processes finish in, not the order that they're called. For
    r = hard_computation([2*n for n in range(1,8)])
    for x in r:
        print x
I get the following (I'm running this on a machine with two processors):
    (((12,), {}), '2^2 * 3')
    (((14,), {}), '2 * 7')
    (((10,), {}), '2 * 5')
    (((8,), {}), '2^3')
    (((4,), {}), '2^2')
    (((6,), {}), '2 * 3')
    (((2,), {}), '2')
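Because the results arrive in completion order, a useful trick when you need them in call order is to index them by the arguments stored in `x[0]`. A small plain-Python sketch (the `raw` data here is made up to mimic the output format above):

```python
# simulated @parallel output: ((args, kwargs), result) pairs,
# arriving in whatever order the worker processes finished
raw = [(((8,), {}), '2^3'), (((12,), {}), '2^2 * 3'), (((6,), {}), '2 * 3')]

# index each result by its first positional argument ...
by_input = {args[0]: out for (args, kwargs), out in raw}

# ... then read the results back in the order the inputs were supplied
ordered = [by_input[n] for n in (6, 8, 12)]
```

This works because the input arguments travel with each result, so no ordering information is lost even though the scheduling is nondeterministic.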
And lastly I'll just give an example of calling `hard_computation` with more inputs. Basically, you can give either a tuple of inputs, or a dict of keyword inputs, but not a combination of the two:
    r = hard_computation([(2*n, 3, True) for n in range(3,10)])
    s = hard_computation([{'n': 2*n, 'd': 3, 'show_out': True} for n in range(3,10)])
    for x in r:
        print x

which prints:

    factoring 6 / 3 ..
    (((6, 3, True), {}), '2')
    factoring 8 / 3 ..
    (((8, 3, True), {}), '2')
    factoring 10 / 3 ..
    (((10, 3, True), {}), '3')
    factoring 12 / 3 ..
    (((12, 3, True), {}), '2^2')
    factoring 14 / 3 ..
    (((14, 3, True), {}), '2^2')
    factoring 16 / 3 ..
    (((16, 3, True), {}), '5')
    factoring 18 / 3 ..
    (((18, 3, True), {}), '2 * 3')
Note the difference in `x[0]` here:
    for x in s:
        print x

    factoring 6 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 6}), '2')
    factoring 8 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 8}), '2')
    factoring 10 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 10}), '3')
    factoring 12 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 12}), '2^2')
    factoring 14 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 14}), '2^2')
    factoring 16 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 16}), '5')
    factoring 18 / 3 ..
    (((), {'show_out': True, 'd': 3, 'n': 18}), '2 * 3')
----
p.s. I think something like this should probably be included in the reference manual for `@parallel` (see ticket #11462). Suggestions for how to improve it would be welcome, either here or on the ticket page :)
Sun, 14 Aug 2011 14:43:06 -0500

Comment by benjaminfjones:
I see you started #11462 to improve the parallel documentation (in case anyone else here wants to look at the ticket and contribute).
Sun, 14 Aug 2011 17:44:13 -0500

Comment by niles:
Ha, yeah I forgot to mention that; thanks :)
Mon, 15 Aug 2011 01:52:06 -0500

Comment by benjaminfjones:
That's great. This should certainly be added to the manual.
Sun, 14 Aug 2011 17:26:26 -0500

Answer by Eugene:
http://ask.sagemath.org/question/8272/whenhow-to-use-parallel/?answer=12575#post-id-12575

Well, I guess parallel is more of an optimization tool, not something to use routinely in **every** function. Not every function requires parallelism; what's more, inherently sequential algorithms cannot be executed in parallel.
So, if the calculation speed is quite satisfactory for you, then you should not use parallel (if not just for fun).
Sat, 13 Aug 2011 23:25:26 -0500

Comment by StevenPollack:
Right. I'm not sure I know what a "linear algorithm" is. Let me edit my question so I can be more specific.
Sun, 14 Aug 2011 02:37:33 -0500
http://ask.sagemath.org/question/8272/whenhow-to-use-parallel/?comment=21374#post-id-21374