Read a very large file
I have a very large file and I read it as follows:
with open('/Users/jianrongli/Downloads/SmallRank4ModulesGr416.txt', 'r') as fp:
    L = [sage_eval(line) for line in fp.readlines() if line.strip()]
The file is very large (more than 3 GB). Each element of L is a 2-D array like [[1,2,3,5],[2,3,4,9]]. Now I want to select those elements of L whose maximum entry is at most 12.
r1 = []
for j in b2:  # b2 is the list read from the file (L above)
    t1 = []
    for i in j:
        t1 = t1 + list(i)
    if max(t1) <= 12:
        r1.append(j)
len(r1)
Since the file is very large, this has been running for a few hours and has not finished. Is there some way to make it faster? Thank you very much.
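For reference, the `t1 = t1 + list(i)` pattern rebuilds the accumulated list on every iteration, which is quadratic in the number of entries. The same test can be written with a single `max()` over a generator, touching each entry once and building no intermediate list (a sketch, with a tiny sample standing in for the `b2` from the question):

```python
# b2 stands for the list read from the file in the question;
# a tiny sample is used here for illustration only.
b2 = [[[1, 2, 3, 5], [2, 3, 4, 9]], [[1, 2], [13, 4]]]

# max() over a generator scans every entry once, with no
# intermediate flattened copy of each 2-D array.
r1 = [j for j in b2 if max(x for row in j for x in row) <= 12]
```

This avoids the quadratic concatenation, but the dominant cost of the original program is still reading and parsing the whole file up front.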
Your condition can be terser:
if max(flatten(j)) <= 12
should do the job. But your main problem is that you have to read the whole of b2 into memory before even starting to process it. Can you somehow (sed comes to mind...) restructure your input file so that it has exactly one element of b2 on each line? If so, your program could simply filter the file line by line as it reads it, never holding all of b2 in memory at once. HTH,
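A minimal sketch of the line-by-line filter described above, assuming the file has been restructured to hold exactly one 2-D array per line. `ast.literal_eval` stands in for Sage's `sage_eval` here so the sketch is plain Python; the function name `filter_small` and the `bound` parameter are illustrative, not from the original thread:

```python
import ast

def filter_small(path, bound=12):
    """Stream the file line by line, keeping only the 2-D arrays
    whose maximum entry is <= bound.  Memory use is proportional
    to the kept results, not to the size of the whole file."""
    kept = []
    with open(path) as fp:
        for line in fp:      # iterating fp reads one line at a time
            line = line.strip()
            if not line:
                continue
            j = ast.literal_eval(line)  # one 2-D array per line
            # max over all entries without building a flattened copy
            if max(max(row) for row in j) <= bound:
                kept.append(j)
    return kept
```

Because the file is never loaded whole, this starts producing work immediately and its memory footprint stays small even for a multi-gigabyte input.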
@Emmanuel, thank you very much!
Does it work for you ?
@Emmanuel, yes, the speed is increased. Thank you very much.
OK. I'll transcribe that as an answer, for the sake of future users with a similar question. Feel free to accept it.