# Making code faster by splitting and use of supercomputer

There John Palmieri has written a very good code. As the output is huge and hence time-taking(No output even for 4th order matrices with 4 summands even after many days), so I tried to split the code in parts according to the logic given in the following link:

and hence have this code:

def sum_list(length, total):
"""
INPUT:

- length -- how many terms
- total -- desired total

Return list of lists, each of which has length entries chosen
from {0, 1, -1} and adds up to total.
"""
C = IntegerListsLex(length=length, n=length+total, min_part=0, max_part=2)
return [[a-1 for a in L] for L in C]
def mat_list_NEW(mat, length):
"""
INPUT:

- mat -- matrix, assumed to have entries in {1, -1}
- length -- how many terms

Return list of lists, each of which has length entries and
adds up to mat, and each entry is either a (0, 1) or a (0,
-1) matrix.
"""
sum_plus = sum_list(length, 1)
sum_minus = sum_list(length, -1)
old_attempts = []
for x in mat.list():
if x == 1:
sums = sum_plus
elif x == -1:
sums = sum_minus
else:
raise ValueError('each entry of matrix should be 1 or -1')
if not old_attempts: # initial step
new_attempts = set(tuple([(s,) for s in S]) for S in sums)
else:
new_attempts = set()
for mats in old_attempts:
NEW = set()
for S in sums:
new_M = tuple(sorted(M + (s,) for (M, s) in zip(mats, S)))
# keep only entries that correspond to (0,1) or (0,-1) matrices
if all(max(entries) - min(entries) < 2 for entries in new_M):

new_attempts.update(NEW)
old_attempts = new_attempts
LOA=list(old_attempts)

P=len(LOA)

N=floor(P/5)

R = range(0,P , N)

c0 = LOA[R:R+N]
#c1 = LOA[R:R+N]
#c2 = LOA[R:R+N]
#c3 = LOA[R:R+N]
#c4 = LOA[R:R+N]
#c5 = LOA[R:R+N]

matrices = [[matrix(mat.base_ring(), mat.nrows(), mat.ncols(), entries) for entries in L] for L in c0]
return (matrices)

mat=matrix(3,3, [1, 1, 1, -1, 1, 1, -1, -1, 1])#Enter your matrix
mat_list_NEW(mat,4)#Specify in how many matrices you want to break it up


In the above link it mentions storing of c0,c1,c2,….in csv file because of memory limitations. I have no idea of csv files. So thought of writing each of ci’s individually (as I have written above in the code) and run codes separately. Kindly guide me how to use csv files to further help in running this code.

Note that in the above code for 3rd order matrix with 4 summands P is 73261 and for 5 summands P is 11740316. And we need for higher order matrices(upto say 30) and summands(upto say10) for which P will be very very huge. Somebody told me that if code is written in parallel, then it will run faster.

Also, I will be using supercomputing facility with the following details:

              OS: CentOS 7 Hardware Specifications
Type I: Total no. of compute nodes: 420
CPU only nodes: 259
CPU Architecture : Haswell
2x Intel Xeon E5-2680 v3/12-Core(24 cores each)
RAM: 64 GB
-----------------------------------
GPU accelerated nodes: 161
(CPU config same as above + GPU cards)
GPU Architecture : 2x NVIDIA K40 (each)
(12GB, 2880 CUDA cores)
-----------------------------------
High Memory Nodes : 512 GB RAM
12 CPU,  8 GPU nodes

TYPE II:Total no. of compute nodes: 184
CPU only nodes: 144
CPU Architecture : Skylake
2x Intel Xeon G-6148/20 cores(40 cores per node)
RAM: 96 GB
--------------------------------------------
GPU accelerated nodes: 40 (CPU conﬁg same )
GPU Architecture : NVIDIA V100
(32GB,5120 CUDA cores)
1 Nvidia V100 card each GPU node: 17
2 Nvidia V100 cards each GPU node:23
High Memory Nodes :  192 GB RAM
8 CPU, 40 GPU nodes


Requesting for resources will be clear with following examples:

 1.Request for 2 chunks with 20 cpus, 2 gpu cards for 2 hrs by mentioning as follows:
" -lselect=2: ncpus=20: ngpus=2 : -lwalltime=02:00:00"
2. Request for 2 chunks  i.e. full skylake nodes, 1gpu card for 3 hours by mentioning as follows:
" -lselect=2: ncpus=40: ngpus=1: centos=skylake: -l walltime=03:00:00"


Kindly guide me how to choose the resources viz. lselect, ncpus, ngpus for my code(Please note that the time limit is 168 hours per code)

So kindly guide with respect to the above mentioned points. Thanks.

edit retag close merge delete

1

It is very well possible that higher sizes will render the problem of generating all such solutions impractical even at a supercomputer. On the other hand, the structure of solutions here is relatively simple, and it may be possible to generate (some) solutions on demand or analyze them without generating all solutions. This brings the question - what is the context of this problem and what the solutions are used for? Do you really need them all generated at once?

Basically , I require those combinations which in addition to the conditions mentioned in link also satisfy that each matrix product AiAj is a linear combination of Ai's. Broken the code into two-1st part is the one in my link(now will be using the above code) whose output be used in 2nd part of the code which checks this last condition. As to generating some solutions on demand let me mention that in the link you may refer to my code in which I have specified the no. of non-negative/ non-positive summands , in which we can further specify no. of symmetric/non-symmetric matrices. No idea how to do in above code. As to your suggestion of analyzing without generating, please throw some light. No, I don't need to generate all at once, hence the splitting done in above code.

1

As I understand, the condition "that each matrix product AiAj is a linear combination of Ai's" is essential here and initially ignoring it results in combinatorial explosion. Rather than ignoring it and checking at later stages, it'd be more efficient to find a way generating only those solutions that satisfy this additional condition right away. I will take a look into this problem. Do you have any restriction on the coefficients of the linear combination - e.g., do they have to be integer or rational?

Yes, checking the last condition alongside should really help. Well,regarding the coefficients, they have to be integers. One more thing I want to say that in my question in the link I had not made it clear that the summand matrices have to be unique and that the matrices with all same entries (that is all zeroes, all ones and all minus ones matrices)are also not allowed.Actually in my code(in the link) these were satisfied so perhaps it slipped from my mind. Later when I pointed out, John Palmieri suggested some changes but it actually didn't work out(Please see the posts in the link). If such combinations are also removed then obviously it will save some more time. Thanks for looking into my problem.

Sort by » oldest newest most voted As I explained in the comments, it's unwise to ignore the condition that "each matrix product $A_iA_j$ is a linear combination of $A_i$'s with integer coefficients". If this condition is employed at earlier stages, a supercomputer may not be needed, and if it's ignored until later stages, a supercomputer may not be able to help.

In particular, this condition means that the lattice spanned by matrices $A_i$ must contain all powers of the given matrix $M$. It also follows that the number of matrices should be at least the degree of the minimal polynomial of $M$. In the given example, we have

M = matrix(3,3, [1, 1, 1, -1, 1, 1, -1, -1, 1])
k = M.minpoly().degree()


which gives $k=3$. So, the powers of $M$ span a subspace $V:=\langle M^0, M^1, M^2\rangle$. Since the span $\langle A_1, A_2, A_3, A_4\rangle$ contains $V$, we have all $A_i\in \langle M^0, M^1, M^2, A_1\rangle$. So, one can try to go over all 01-matrices $A_1$, and then find all 01-matrices in $\langle M^0, M^1, M^2, A_1\rangle$ to restrict the set of candidates for $A_2, A_3, A_4$.

more

Okay....I was not aware of the concepts you have mentioned. Thanks. I need to understand those concepts and proceed as you have suggested. Thanks again.