Ask Your Question
0

Make graph generation parallel?

asked 2014-08-15 21:09:24 +0200

Pelonza gravatar image

updated 2015-01-17 22:26:07 +0200

FrédéricC gravatar image

Hi All,

I understand that sage has a 'built-in' ability to make functions run in parallel, which is awesome. However the program I'm running is not being bogged down by my own functions, but rather some of the built in sage functions. Specifically, I'm using the built in generic graph generator to create large graphs for testing. This is inherently slow (massive combinatorics), however, I've got 6 cpu cores that are just sitting idle.

I realize I could "dummy" parallel by running a few instances for different graph sizes, but it seemed to me that the graph generator methods, which simply return the list of things for me to test on _should_ be able to be made parallel inherently. I don't care if the graphs are ordered (since I can always sort ones I keep later)

I was having a hard time actually finding WHERE in the sage source codes the BASIC generator/constructor was (I found the documentation, and sort of found code/documentation for some of the specific types of graphs).

To be clear, I'm running the line: sage: g=list(graphs(12) )

And I want "graphs(12)" to run in parallel. I am using some of the other arguments to reduce what I need to generate.

edit retag flag offensive close merge delete

Comments

@Pelonza , Ive been trying to figure out what gets called exactly when you enter say graphs(6) , but the only reference to that notation ive found so far is in /sage/src/sage/graphs/graph.py on line 338. But between that and the parallel computing manual: http://www.sagemath.org/pdf/en/reference/parallel/parallel.pdf , it doesnt seem to me at least that graphs(12) is something you can make parallel, without modifying some of the graphing algorithms. But being able to parallelize such an operation if its currently not possible, would be a good ticket idea as suggested.

Paul Graham gravatar imagePaul Graham ( 2014-08-16 11:27:18 +0200 )edit

@Paul I'll see about submitting a ticket on it. It probably is something parallelizable, or, as the answer below states, it might be something that by working on the generator actually is "parallelized" right now.

Pelonza gravatar imagePelonza ( 2014-08-18 00:10:09 +0200 )edit

2 Answers

Sort by » oldest newest most voted
1

answered 2014-08-16 09:00:58 +0200

rws gravatar image

Since you are asking for modification of existing Sage code the general answer is to open a ticket for your enhancement idea. However, in this case you might want to improve the code you use by taking advantage of the fact that graphs() returns a Python iterator/generator.

https://docs.python.org/2/howto/funct...

But, in order to help you with that we must know what your surrounding code is. In summary, please always give code examples.

edit flag offensive delete link more

Comments

After a bunch more investigation, it seems there might be a way to do slicing on the generator, then parallelize the slices, although, there might be something far more subtle underlying the "slicing"... I.e. when you use islice, does it still actually "USE" the generator? that is, if I sliced a 100 element list, into two parts, 1-50 and 51-100, does the sliced iterator on 51-100 still have to "call" 1-50, it just doesn't actually return the value to whatever is using the sliced iterator?

Pelonza gravatar imagePelonza ( 2014-08-18 01:27:56 +0200 )edit
0

answered 2014-08-18 00:09:00 +0200

Pelonza gravatar image

Happy to give my specific code, since I think the answer will help me understand generators/parallel structure better anyway...

Here's my testing function:

def is_kdense(g,k):
k_dense=True
if min(g.degree())< k-2:
    k_dense=False
if g.size()<floor(g.order()*(k-1)/2):
    k_dense=False
for i in list(g.edge_iterator(labels=False)):
    common= len(set(g.neighbors(i[0])).intersection(g.neighbors(i[1])))
    if common < k-2:
        k_dense=False
        break
return k_dense

This is being called from a main file by:

#generator for n=9
n9graphs=set()
for g in graphs.nauty_geng("9 -c 27:32 -d6"):
    gtmp=g.copy(immutable=True)
    if is_kdense(gtmp,7):
        n9graphs.add(gtmp) 
print "finished first-9"
for g in n9graphs:
    print g.adjacency_matrix()
    print "next graph"

So basically, the main file creates a generator with:

graphs.nauty_geng("9 -c 27:32 -d6")

Then, I could test each graph from the generator in parallel (without any conflict from my test function). However, I do still need a way of saving things that test true, which probably requires a lock around the

n9graphs.add(gtmp)
edit flag offensive delete link more

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools

Stats

Asked: 2014-08-15 21:09:24 +0200

Seen: 541 times

Last updated: Aug 18 '14