Can I parse tab-separated (\t) data into vectors?

asked 11 years ago

Jaakko Seppälä

updated 11 years ago

I have CSV data as follows:

pcb138  pcb180  pcb52   pcb118  pcb
1.46    0.738   0.532   0.72    19.9959
0.64    0.664   0.03    0.236   6.0996
3.29    1.15    0.134   1.54    24.9655
3.94    1.33    0.466   1.94    37.4436
3.18    2.14    0.243   1.47    30.183
2.43    1.3     0.137   1.31    20.8036
3.94    3.49    0.208   0.876   41.3818
3.38    1.04    0.477   2.46    29.478
2.21    0.966   0.457   1.14    24.2387
2.49    1.59    0.298   1.18    26.3198
0.86    0.395   0.02    0.406   8.591
3.38    1.85    0.539   1.5     36.4229
7.39    4.42    0.707   3.55    66.4108

Is it possible to read that to vectors? I tried the following:

sage: import csv                           
sage: file='/home/jaakko/Downloads/pcb.dat'
sage: reader=csv.reader(open(file))        
sage: L=[]                                 
sage: for row in reader:
....:     L.append(row)
sage: L[0][0]
'pcb138\tpcb180\tpcb52\tpcb118\tpcb'
sage: L[1][0]
'1'
sage: L[1][1]
'46\t'

What I would like to have is vectors of the form

pcb138=vector([1.46,0.64,...,7.39])

without the tab characters.


2 Answers


answered 11 years ago

Luca

Your data is not really CSV, since it is not comma-separated. You could configure the csv module to parse it properly, but it is simpler to use plain Python. Here is a solution that uses a dictionary instead of separate variables named pcb138, etc.

sage: with open(file) as f:
....:     data = [line.rstrip('\n').split('\t') for line in f if line.strip()]
sage: vects = {col[0]: vector([float(x) for x in col[1:]]) for col in zip(*data)}
sage: vects['pcb138']                                                     
(1.46, 0.64, 3.29, 3.94, 3.18, 2.43, 3.94, 3.38, 2.21, 2.49, 0.86, 3.38, 7.39)
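
If the file might contain doubled tabs or trailing whitespace, one variant of the same idea is to call `split()` with no argument, which splits on any run of whitespace. The following is only a minimal sketch in plain Python, assuming that no field contains spaces and that no values are missing. `read_columns` is a hypothetical helper name, and the `io.StringIO` object stands in for `open(file)`. In Sage, each list could then be wrapped in `vector(...)`.

```python
import io


def read_columns(lines):
    """Return {header: [float, ...]} from whitespace-separated lines."""
    # split() with no argument splits on any run of whitespace, so doubled
    # tabs and trailing newlines do not produce empty fields.
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[1:]
    # zip(*body) transposes the list of rows into a sequence of columns.
    return {name: [float(x) for x in col]
            for name, col in zip(header, zip(*body))}


# Stand-in for open(file): the header and first two data rows,
# with one doubled tab added to the last row for illustration.
sample = io.StringIO(
    "pcb138\tpcb180\tpcb52\tpcb118\tpcb\n"
    "1.46\t0.738\t0.532\t0.72\t19.9959\n"
    "0.64\t0.664\t\t0.03\t0.236\t6.0996\n"
)
columns = read_columns(sample)
print(columns["pcb138"])
```

The trade-off is that `split()` cannot tell a doubled tab from a genuinely missing value, so this only suits files where every row is complete.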

answered 11 years ago

tmonteil

If you want to use the csv module, you can specify that the delimiter is the tab character as follows:

sage: reader=csv.reader(open(file), delimiter='\t')

then,

sage: L=[]
sage: for row in reader:                   
....:     L.append(row)
sage: L
[['pcb138', 'pcb180', 'pcb52', 'pcb118', 'pcb'],
 ['1.46', '0.738', '0.532', '0.72', '19.9959'],
 ['0.64', '0.664', '0.03', '0.236', '6.0996'],
 ['3.29', '1.15', '0.134', '1.54', '24.9655'],
 ['3.94', '1.33', '0.466', '1.94', '37.4436'],
 ['3.18', '2.14', '0.243', '1.47', '30.183'],
 ['2.43', '1.3', '', '0.137', '1.31', '20.8036'],
 ['3.94', '3.49', '0.208', '0.876', '41.3818'],
 ['3.38', '1.04', '0.477', '2.46', '29.478'],
 ['2.21', '0.966', '0.457', '1.14', '24.2387'],
 ['2.49', '1.59', '0.298', '1.18', '26.3198'],
 ['0.86', '0.395', '0.02', '0.406', '8.591'],
 ['3.38', '1.85', '0.539', '1.5', '', '36.4229'],
 ['7.39', '4.42', '0.707', '3.55', '66.4108']]
sage: L[1][0]
'1.46'
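
The rows above are still lists of strings, and two of them contain an empty field `''`, which suggests doubled tabs in the file. One way to get from there to per-column numeric data is sketched below in plain Python. It assumes that empty fields are artifacts rather than missing values. `columns_from_tsv` is a hypothetical helper name, and the `io.StringIO` object stands in for `open(file)`. In Sage, each resulting list could be passed to `vector(...)`.

```python
import csv
import io


def columns_from_tsv(handle):
    """Return {header: [float, ...]} from a tab-separated file object."""
    reader = csv.reader(handle, delimiter="\t")
    # Drop empty fields (assumed to come from doubled tabs), then blank rows.
    rows = [[field for field in row if field] for row in reader]
    rows = [row for row in rows if row]
    header, body = rows[0], rows[1:]
    # zip(*body) transposes rows into columns; float() converts the strings.
    return {name: [float(x) for x in col]
            for name, col in zip(header, zip(*body))}


# Stand-in for open(file): the header plus two data rows,
# the first of which has a doubled tab as in the output above.
sample = io.StringIO(
    "pcb138\tpcb180\tpcb52\tpcb118\tpcb\n"
    "2.43\t1.3\t\t0.137\t1.31\t20.8036\n"
    "3.94\t3.49\t0.208\t0.876\t41.3818\n"
)
cols = columns_from_tsv(sample)
print(cols["pcb52"])
```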

Stats

Asked: 11 years ago

Seen: 2,085 times

Last updated: Aug 10 '13