Can I parse CSV data separated by \t into vectors?

asked 2013-08-10 04:46:38 +0100

Jaakko Seppälä

updated 2013-08-10 10:14:39 +0100

I have CSV data as follows:

pcb138  pcb180  pcb52   pcb118  pcb
1.46    0.738   0.532   0.72    19.9959
0.64    0.664   0.03    0.236   6.0996
3.29    1.15    0.134   1.54    24.9655
3.94    1.33    0.466   1.94    37.4436
3.18    2.14    0.243   1.47    30.183
2.43    1.3     0.137   1.31    20.8036
3.94    3.49    0.208   0.876   41.3818
3.38    1.04    0.477   2.46    29.478
2.21    0.966   0.457   1.14    24.2387
2.49    1.59    0.298   1.18    26.3198
0.86    0.395   0.02    0.406   8.591
3.38    1.85    0.539   1.5     36.4229
7.39    4.42    0.707   3.55    66.4108

Is it possible to read that into vectors? I tried the following:

sage: import csv                           
sage: file='/home/jaakko/Downloads/pcb.dat'
sage: reader=csv.reader(open(file))        
sage: L=[]                                 
sage: for row in reader:
....:     L.append(row)
....:
sage: L[0][0]
'pcb138\tpcb180\tpcb52\tpcb118\tpcb'
sage: L[1][0]
'1'
sage: L[1][1]
'46\t'

What I would like to have is vectors of the form

pcb138=vector([1.46,0.64,...,7.39])

without those tab characters.


2 Answers

answered 2013-08-10 06:29:58 +0100

Luca

Your data is not really CSV, since it is not comma-separated. You could configure csv to parse it properly, but it is simpler to use native Python. Here's a solution using a dictionary instead of variables named pcb138, etc.

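sage: # split() with no argument splits on any run of whitespace, so repeated tabs and the trailing newline are handled
sage: data = [line.split() for line in open(file)]
sage: vects = {col[0]: vector([float(x) for x in col[1:]]) for col in zip(*data)}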
sage: vects['pcb138']                                                     
(1.46, 0.64, 3.29, 3.94, 3.18, 2.43, 3.94, 3.38, 2.21, 2.49, 0.86, 3.38, 7.39)
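
The key step in this answer is `zip(*data)`, which turns a list of rows into a sequence of columns. Here is a minimal plain-Python sketch of that transposition on a two-column excerpt of the data. The names `rows`, `columns` and `table` are only illustrative, and plain lists are used instead of Sage's `vector` so that it runs outside Sage as well.

```python
# A small excerpt of the table: one header row followed by two data rows.
rows = [
    ['pcb138', 'pcb180'],
    ['1.46', '0.738'],
    ['0.64', '0.664'],
]

# zip(*rows) groups the i-th entry of every row together, i.e. it yields columns.
columns = list(zip(*rows))
print(columns)

# The first entry of each column is its header; the rest are the values.
table = {col[0]: [float(x) for x in col[1:]] for col in columns}
print(table['pcb180'])
```

Note that `zip` stops at the shortest row, so a row with missing or extra fields would silently misalign the columns; checking the row lengths first is probably a good idea on real data.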
answered 2013-08-10 09:35:44 +0100

tmonteil

If you want to use the csv module, you can specify that the delimiter is the tab character as follows:

sage: reader=csv.reader(open(file), delimiter='\t')

then,

sage: L=[]
sage: for row in reader:                   
....:     L.append(row)
sage: L
[['pcb138', 'pcb180', 'pcb52', 'pcb118', 'pcb'],
 ['1.46', '0.738', '0.532', '0.72', '19.9959'],
 ['0.64', '0.664', '0.03', '0.236', '6.0996'],
 ['3.29', '1.15', '0.134', '1.54', '24.9655'],
 ['3.94', '1.33', '0.466', '1.94', '37.4436'],
 ['3.18', '2.14', '0.243', '1.47', '30.183'],
 ['2.43', '1.3', '', '0.137', '1.31', '20.8036'],
 ['3.94', '3.49', '0.208', '0.876', '41.3818'],
 ['3.38', '1.04', '0.477', '2.46', '29.478'],
 ['2.21', '0.966', '0.457', '1.14', '24.2387'],
 ['2.49', '1.59', '0.298', '1.18', '26.3198'],
 ['0.86', '0.395', '0.02', '0.406', '8.591'],
 ['3.38', '1.85', '0.539', '1.5', '', '36.4229'],
 ['7.39', '4.42', '0.707', '3.55', '66.4108']]
sage: L[1][0]
'1.46'
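
To get from these rows to one vector per column, something like the following sketch might work. It is plain Python, so it returns lists of floats rather than Sage vectors. The helper name `read_columns` is hypothetical, and the sample text is a few rows from the question with one doubled tab added to mimic the empty `''` fields visible in the output above. It assumes that empty fields come only from stray extra tabs and can be dropped.

```python
import csv
import io

def read_columns(lines):
    """Return {header: [float, ...]} from tab-separated lines with a header row."""
    reader = csv.reader(lines, delimiter='\t')
    # Drop empty header fields in case the header row also contains doubled tabs.
    header = [name for name in next(reader) if name.strip()]
    columns = {name: [] for name in header}
    for row in reader:
        # Doubled tabs produce empty fields; drop them before converting.
        values = [field for field in row if field.strip()]
        if not values:
            continue  # skip blank lines
        if len(values) != len(header):
            raise ValueError("unexpected number of fields: %r" % (row,))
        for name, value in zip(header, values):
            columns[name].append(float(value))
    return columns

# A few rows from the question; the last one contains a doubled tab.
sample = io.StringIO(
    "pcb138\tpcb180\tpcb52\tpcb118\tpcb\n"
    "1.46\t0.738\t0.532\t0.72\t19.9959\n"
    "0.64\t0.664\t0.03\t0.236\t6.0996\n"
    "2.43\t1.3\t\t0.137\t1.31\t20.8036\n"
)
columns = read_columns(sample)
print(columns['pcb138'])
```

With the real file you would pass `open(file, newline='')` instead of the `StringIO` object, and in Sage each list can then be wrapped as `vector(columns['pcb138'])`.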
