Read a very large file

Tags: list, file

asked 2022-04-23 15:26:01 +0200

lijr07

I have a very large file and I read it as follows:

with open('/Users/jianrongli/Downloads/SmallRank4ModulesGr416.txt', 'r') as fp:
    L = [sage_eval(line) for line in fp.readlines() if line.strip()]

The file is very large (more than 3 GB). Each element of L is a 2D array such as [[1,2,3,5],[2,3,4,9]]. Now I want to select the elements of L whose largest entry is at most 12.

b2 = L  # assumption: b2 is the list L read above
r1 = []
for j in b2:
    t1 = []
    for i in j:
        t1 = t1 + list(i)  # collect the entries of every row of j
    if max(t1) <= 12:
        r1.append(j)
len(r1)

Since the file is very large, this has been running for a few hours and has not finished. Is there some way to make it faster? Thank you very much.


Comments

Your condition can be terser: if max(flatten(j)) <= 12 should do the job.
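
For readers without Sage at hand, the same test can be sketched in plain Python. This assumes each element is a list of non-empty rows of integers; the helper name max_entry is hypothetical. It takes the maximum row by row, so no flattened copy is built.

```python
def max_entry(array_2d):
    """Return the largest number in a 2D list such as [[1, 2, 3, 5], [2, 3, 4, 9]]."""
    # Assumes every row is non-empty; max() raises ValueError on an empty sequence.
    return max(max(row) for row in array_2d)


sample = [[1, 2, 3, 5], [2, 3, 4, 9]]
keep = max_entry(sample) <= 12  # same condition as max(flatten(j)) <= 12
```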

But your main problem is that you have to read the whole of b2 into memory before even starting to process it. Can you somehow (sed comes to mind...) restructure your input file so that it has exactly one element of b2 on each line? If so, your program could simply do something along these lines:

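L = []
with open('Your/file/name.txt') as fp:
    for l in fp:                     # read one line at a time
        if not l.strip(): continue   # skip blank lines, which eval cannot parse
        j = eval(l)
        if max(flatten(j)) <= 12: L.append(j)   # flatten is Sage's built-in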

HTH,

Emmanuel Charpentier ( 2022-04-23 17:04:47 +0200 )

@Emmanuel, thank you very much!

lijr07 ( 2022-04-23 18:04:19 +0200 )

Does it work for you?

Emmanuel Charpentier ( 2022-04-23 19:49:20 +0200 )

@Emmanuel, yes, it is faster now. Thank you very much.

lijr07 ( 2022-04-23 21:49:34 +0200 )

OK. I'll transcribe that as an answer, for the sake of future users with a similar question. Feel free to accept it.

Emmanuel Charpentier ( 2022-04-24 08:21:11 +0200 )
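
The line-by-line approach suggested in the comments above can be sketched in plain Python as follows. This is only a sketch under stated assumptions: each non-blank line is taken to hold exactly one element written as a Python literal of plain integers, and ast.literal_eval is swapped in for eval/sage_eval, so the entries come back as Python ints rather than Sage Integers. The names filter_lines, filter_file and bound are hypothetical.

```python
import ast


def filter_lines(lines, bound=12):
    """Yield each parsed 2D list whose largest entry is at most `bound`."""
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        # ast.literal_eval parses Python literals only, so it cannot run
        # arbitrary code from the file the way eval can.
        array_2d = ast.literal_eval(line)
        if max(max(row) for row in array_2d) <= bound:
            yield array_2d


def filter_file(path, bound=12):
    """Stream the file at `path` line by line and return the matching elements."""
    with open(path) as fp:  # the file object yields one line at a time
        return list(filter_lines(fp, bound))


# Small in-memory demonstration; a real run would call filter_file(path).
demo = ["[[1,2,3,5],[2,3,4,9]]\n", "\n", "[[1,2,3,13],[2,3,4,9]]\n"]
selected = list(filter_lines(demo))
```

Only the elements that pass the test are kept, so memory use is governed by the size of the result rather than by the size of the file.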
