Generate Maximal subsets based on mutual/subset property

1

asked 11 years ago

rickhg12hs
136 ●6 ●8 ●15

updated 2 years ago

tmonteil
27493 ●32 ●205 ●519 http://wiki.sagemath.o...

Sorry for the imprecise language - I'm not a mathematician.

Given a list/set/dict/etc., is there a function/method available that will generate subsets, with the possibility of non-null intersections, based on some user defined mutual/subset property?

E.g., given a list [0,1,2,3,4,5], generate the maximal lists such that the maximum difference within each list is 3. For the given list, the function would produce [[0,1,2,3],[1,2,3,4],[2,3,4,5]]. For the list [0,3,9,10,17,30,33], this function would produce [[0,3],[9,10],[17],[30,33]].

This is different than the usual filtering operation in that the defining property is about the returned lists/sets/etc., not just about the individual elements.

Function should work with members of ZZ, QQ, RR, RDF, CC, CDF, and others if possible.

add a comment

2

answered 11 years ago

tmonteil
27493 ●32 ●205 ●519 http://wiki.sagemath.o...

updated 11 years ago

You can also notice that your property defining the set B is a property that is tested on pairs of elements of A. Hence, you can construct the graph corresponding to admissible pairs:

sage: A = Set([0,3,9,10,17,30,33])
sage: G = Graph()
sage: G.add_vertices(A)
sage: for u,v in CartesianProduct(A,A):
....:    if u != v and max(u,v) - min(u,v) <=3:
....:        G.add_edge(u,v)

What you are looking for is the set of maximal cliques of G:

sage: G.cliques_maximal()
[[0, 3], [9, 10], [17], [33, 30]]

I let you benchmark between the two methods and tell us which is better in which cases !

To understand the link between the cliques of the graph G and the simplicial complex C in my previous answer, you can read about "Rips complex".

EDIT : This method works, but there is a Sage bug when the set is made of RDF elements (see the following discussion and trac ticket 14853). Here is a dirty workaround until the bug is fixed: we replace a RDF element by a tuple made of a single RDF element.

sage: A = Set([RDF(0),RDF(3),RDF(9),RDF(10),RDF(17),RDF(30),RDF(33)])
sage: Atuple = map(lambda x:(x,), A)
sage: G = Graph()
sage: G.add_vertices(Atuple)
sage: for u,v in CartesianProduct(Atuple, Atuple):
....:    if u != v and max(u[0],v[0]) - min(u[0],v[0]) <=3:
....:        G.add_edge(u,v)

Then,

sage: C = [[c[0] for c in clique] for clique in G.cliques_maximal()] ; C
[[0.0, 3.0], [9.0, 10.0], [17.0], [33.0, 30.0]]

EDIT : The bug has been fixed in sage 5.12, so you can now use this method as well without having to use the workaround, even with vertices in RDF.

link

Comments

This does not seem to work with RDF members of Set A.

rickhg12hs ( 11 years ago )

It seems to work for me with RDF entries: sage: A = Set([RDF(0),RDF(3),RDF(9),RDF(10),RDF(17),RDF(30),RDF(33)]) sage: G = Graph() sage: G.add_vertices(A) sage: G Graph on 7 vertices sage: G.vertices() [0, 3, 9, 10.0, 17.0, 30.0, 33.0] sage: for u,v in CartesianProduct(A,A): ....: if u != v and max(u,v) - min(u,v) <=3: ....: G.add_edge(u,v) ....: sage: G.cliques_maximal() [[0, 3], [9, 10.0], [17.0], [33.0, 30.0]] Could you give an example where it fails ?

tmonteil ( 11 years ago )

Sorry, should have been more explicit. Try: `A=Set([RDF.random_element(min=0,max=10) for k in range(10)])`

rickhg12hs ( 11 years ago )

Oh! You are right, there seems to be a problem in the construction of graphs: sage: A=Set([RDF.random_element(min=0,max=10) for k in range(10)]) ; A {6.42320967152, 1.77698693175, 2.95922392964, 9.50745089775, 4.60546838289, 3.67300191731, 5.21254750195, 5.90579400282, 7.55309974188, 0.442799267782} sage: G = Graph() sage: G.add_vertices(A) sage: G.vertices() [0, 1, 2, 3, 4, 5, 6, 7, 9] Nathann i am lost, could you explain that behaviour (and help us with a workaround) ?

tmonteil ( 11 years ago )

1) That's clearly a bug 2) I have no idea where it comes from, but probably from Robert Miller's attempt to avoid label if it is possible 3) I have no way to receive emails when comments are added to a question on ASK Sage (I never received any mail from AskSage in general) so asking me questions there is bound to fail :-P

Nathann ( 11 years ago )

see more comments

2

answered 11 years ago

tmonteil
27493 ●32 ●205 ●519 http://wiki.sagemath.o...

updated 11 years ago

First, if your want to filter (non-necessary maximal) subsets with a condition that implies testing all subsets of your given set A, you can try something like:

sage: A = Set([0,1,2,3,4,5])
sage: B = Set(a for a in A.subsets() if len(a) != 0 and max(a) - min(a) <= 3)

Note that i had to exclude the empty set to be able to pick the minimal and the maximal element of the iterated subsets of A. Note that i hd to use a Sage Set and not a python set to be able to get its subsets.

Then, you could notice that your property is stable by subsets, so you have more than a set of subsets, you have a simplicial complex:

sage: SimplicialComplex(B)
Simplicial complex with vertex set (0, 1, 2, 3, 4, 5) and facets {(1, 2, 3, 4), (2, 3, 4, 5), (0, 1, 2, 3)}

From which you can get maximal faces:

sage: C = SimplicialComplex(B).maximal_faces() ; C
{(1, 2, 3, 4), (2, 3, 4, 5), (0, 1, 2, 3)}

In your second example, you get:

sage: A = Set([0,3,9,10,17,30,33])
sage: B = Set(a for a in A.subsets() if len(a) != 0 and max(a) - min(a) <= 3)
sage: C = SimplicialComplex(B).maximal_faces() ; C
{(9, 10), (17,), (30, 33), (0, 3)}

link

Comments

This also works with RDF members of set A. Important to me.

rickhg12hs ( 11 years ago )

Selecting this as the best answer for the examples given, mostly for its relative simplicity. Thanks!

rickhg12hs ( 11 years ago )

Changed my mind about preferred solution after doing some performance testing. Determining maximal cliques seems to be *** orders of magnitude *** faster.

rickhg12hs ( 11 years ago )

add a comment

Generate Maximal subsets based on mutual/subset property

2 Answers

Comments

Comments

Your Answer

Question Tools

Stats

Related questions

Generate Maximal subsets based on mutual/subset property savecancel

2 Answers

Comments

Comments

Your Answer

Question Tools

Stats

Related questions

Generate Maximal subsets based on mutual/subset property