# Divide Combinations(n,k) into multiple parts (n > 1110)

I asked a similar question here: https://ask.sagemath.org/question/453.... I closed it, because the answer matched the question there. However, now I have an extra question related to it:

C = Combinations(2110, 3)
C_cardinality = binomial(2110, 3) # 1.563.429.820
N = C_cardinality / 95 # Number of parts = 95 (chunk size)
R = range(0, C_cardinality , N)


Because C_cardinality is too big, I would like to divide C into multiple parts, which are then used on multiple computers to process separately. In order to be sure that these multiple parts are totally different but covering the whole C, I use C's list:

c0 = C.list()[R:R+N]
c1 = C.list()[R:R+N]
c2 = C.list()[R:R+N]
...


However, my computer's RAM (memory 128GB) is not enough to store the C's list. Therefore, I iterate C and store in a CSV instead:

def storeCSV(data, fo, wtype='a'):
with open(fo, wtype) as wFile:
write_file = csv.writer(wFile)
write_file.writerow([data])
index = 0
count = 0
for c in C:
storeCSV(c, "part_%s.csv" % index)
count += 1
if count == N:
count = 0
index += 1
print index


Unfortunately, this CSV write is too slow, i.e ~6000 write/min, which means I have to wait about 3800 hours for the whole C to finish.

You might wonder why I need to divide C into multiple parts. After having these parts, I will run the following code on different computers with a purpose to collect information among set of 3 numbers:

v = MatrixSpace(GF(2), 3,9)
g3 = (c for c in c0 if block_matrix(3, 1, [v[c], v[c], v[c]]).rank() >= 6) # or c1, c2, ..., cN
rel = dict()
for g in list(map(tuple,g3)):
# (1, 2, 3)
rel.setdefault(g[::2], []).append(g) # '[1, 3]': 2
rel.setdefault(g[:2], []).append(g[-1]) # '[1, 2]': 3
rel.setdefault(g[1:], []).append(g)  # '[2, 3]': 1


Specifically, I got all Combinations of 3 numbers within c0, c1,...,cN. Then, I filter what I need and store in g3. For all g3 of format (A, B, C), I collect (A,B), (B,C), (A,C) as keys and C,A,B as values respectively. The most important reason I must collect this in a Python dictionary is that I need to retrieve any rel[(X, Y)] later on.

I wish the problem is described clearly and someone can support. Thank a lot !!!

edit retag close merge delete