More efficient methods for finding subsets of Subsets()

Calling SS = Subsets([1..100000], 4) takes 50 ms to construct an object representing about 4×10^18 subsets. %%prun suggests the majority of that time is spent in misc.py:556(_stable_uniq).

(Note: trying this with more than ~10^5 elements causes an overflow.)
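For concreteness, here is a sketch of the setup (the cardinality() call is my addition, just to confirm the size; Subsets() never enumerates its elements up front, which is why construction is nearly instant):

```
# Sketch of the setup above. Subsets() is lazy: nothing is enumerated
# at construction time, so this returns in milliseconds even though the
# cardinality is astronomically large.
SS = Subsets([1..100000], 4)
print(SS.cardinality())  # binomial(100000, 4) ~ 4.17e18, computed directly
```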

But getting a subset of that list... hoo boy.

NList = [i for i in Subsets([1..100], 4) if sum(i) == 100] takes 45 s to generate a list with 5952 elements, out of 3.9×10^6 possibilities. About half of that time is spent in set.py:465(__init__).
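For comparison, the same filter can be sketched with plain itertools.combinations, which yields tuples and skips the per-subset Set construction entirely; I'm not sure how much it helps overall, but it isolates the overhead %%prun is pointing at:

```
# Comparison sketch: enumerate the same 4-subsets as plain tuples with
# itertools.combinations, avoiding the per-subset Sage Set construction
# that the profile above blames (set.py __init__).
from itertools import combinations

NList = [c for c in combinations(range(1, 101), 4) if sum(c) == 100]
print(len(NList))  # should be 5952, as in the Subsets() version
```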

Note that the size of the main set has been decreased by a factor of 1000. That's because this list comprehension seems to run at about O(n^4), so going from 100 to 1000 multiplies the time by around 10^4, taking it from ~45 s to about 4.5×10^5 s, i.e. roughly 5 days. (Creating the list first and then filtering it doesn't seem to change the time much.)
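A quick way to sanity-check that O(n^4) estimate (a rough sketch; timings obviously vary by machine, and each halving of n should cut the time by roughly 2^4 = 16):

```
# Empirical scaling check: binomial(n, 4) subsets are enumerated, so the
# wall time should grow roughly like n^4 as n doubles.
import time
for n in [25, 50, 100]:
    t0 = time.time()
    L = [s for s in Subsets([1..n], 4) if sum(s) == n]
    print(n, len(L), round(time.time() - t0, 2), "s")
```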

So my question is: is there a faster way to filter such a long list that I don't know of? I'd like to be working in the [1..2000] range, but the scaling above predicts 45 s × 20^4 ≈ 83 days, and I don't think my kernel will hold out that long. (Probably not even for 5 days.)

Any suggestions welcome!