Ask Your Question

Generating a unique file name for each graph

asked 2013-04-19 05:52:38 +0200

updated 2013-04-19 05:53:44 +0200

I am writing a function whose input is a graph and some other data. I would like to store the computations that this program does each time it is run in a file with one file for each graph, so that, the next time it is run, it need not recalculate the calculations it has already done. In order to do this, I would like to assign a unique file name to each graph. I was thinking of using the hash function of the combinatorial object constructed from the graph to generate file names: CombinatorialObject(G).__hash__()

I have two questions:

  1. Will this always work (the function taking a graph to this hash is well-defined and injective)?

  2. Is there a better way to do this?

edit retag flag offensive close merge delete



I get the impression that `CombinatorialObject(G)` just captures the vertex set of `G`. Eg sage: CombinatorialObject(graphs.PetersenGraph()).__hash__() 4020364995504644057 sage: CombinatorialObject(graphs.CycleGraph(10)).__hash__() 4020364995504644057 Perhaps using a database would be an alternative for this? You can use the graph6 string of the canonically labelled graph for querys. I've used mongodb with sage and it works pretty well.

fidbc gravatar imagefidbc ( 2013-04-19 14:29:25 +0200 )edit

@fidelbc Thanks for the comments, and bringing MongoDB to my attention.

Amri gravatar imageAmri ( 2013-04-20 09:26:51 +0200 )edit

1 Answer

Sort by ยป oldest newest most voted

answered 2013-04-19 06:23:19 +0200

tmonteil gravatar image

updated 2013-04-19 06:28:46 +0200

The set of graphs Sage can produce is virtually infinite (depends on your RAM), whereas a hash is a uniformly bounded number. Hence there can not be an injection between them (see e.g. wikipedia).

However, a hash function is assumed to be quite "mixing", so that the probability to get a collision is close to the estimation given by the birthday paradox (see e.g. wikipedia). Hence, unless your graph database is very huge, you can consider that you won't get a collision (or that the number of useless recalculations will be very small).

That said, especially for simple graphs with less than 262143 vertices, there is a compact ASCII representation (which is injective), given by the method .graph6_string():

sage: G = graphs.PetersenGraph()
sage: G.graph6_string()

The hash method seems faster.

edit flag offensive delete link more


Thanks. This was both enlightening and useful.

Amri gravatar imageAmri ( 2013-04-19 06:36:57 +0200 )edit

It might be worth noting that the graph6_string depends on the labelling of the graph. sage: g=graphs.PetersenGraph(); h=graphs.PetersenGraph() sage: new_labels={0: 4, 1: 6, 2: 5, 3: 7, 4: 0, 5: 8, 6: 3, 7: 1, 8: 9, 9: 2};h.relabel(perm=new_labels) sage: h.is_isomorphic(g) True sage: g.graph6_string() 'IheA@GUAo' sage: h.graph6_string() 'IX`?{HGCW' It might better to get the g6 string of the graph after using the `canonical_label` method. sage: g.canonical_label().graph6_string() 'I@OZCMgs?' sage: h.canonical_label().graph6_string() 'I@OZCMgs?'

fidbc gravatar imagefidbc ( 2013-04-19 14:25:37 +0200 )edit

Your Answer

Please start posting anonymously - your entry will be published after you log in or create a new account.

Add Answer

Question Tools


Asked: 2013-04-19 05:52:38 +0200

Seen: 328 times

Last updated: Apr 19 '13