Fast Way To Transpose And Concat Csv Files In Python?
I am trying to transpose multiple files of the same format and concatenate them into one big CSV file. I wanted to use numpy for transposing as it's a really fast way of doing it bu
Solution 1:
There are various ways of handling headers in genfromtxt. The default is to treat them as part of the data:
In [5]: import numpy as np
In [6]: txt="""time,topic1,topic2,country
...: 2015-10-01,20,30,usa
...: 2015-10-02,25,35,usa"""
In [7]: data=np.genfromtxt(txt.splitlines(),delimiter=',',skip_header=0)
In [8]: data
Out[8]:
array([[ nan,  nan,  nan,  nan],
       [ nan,  20.,  30.,  nan],
       [ nan,  25.,  35.,  nan]])
But since the default dtype is float, the strings all appear as nan.
With names=True, the first line is treated as headers and the result is a structured array. The headers now appear in the data.dtype.names tuple.
In [9]: data=np.genfromtxt(txt.splitlines(),delimiter=',',names=True)
In [10]: data
Out[10]:
array([(nan, 20.0, 30.0, nan), (nan, 25.0, 35.0, nan)],
      dtype=[('time', '<f8'), ('topic1', '<f8'), ('topic2', '<f8'), ('country', '<f8')])
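To make the structured-array case concrete, here is a minimal sketch that reuses the sample text from above. It only illustrates field access by header name. Note that a structured array like this is 1-D (one record per data row), so transposing it has no effect, which is why the plain string array is used for the actual transpose.

```python
import numpy as np

txt = """time,topic1,topic2,country
2015-10-01,20,30,usa
2015-10-02,25,35,usa"""

# names=True consumes the first line as field names
data = np.genfromtxt(txt.splitlines(), delimiter=",", names=True)

print(data.dtype.names)   # the header names
print(data["topic1"])     # one numeric column, selected by name
print(data.shape)         # (2,): one record per data row
print(data.T.shape)       # (2,): .T does nothing on a 1-D array
```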
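With dtype=None, you let it choose the dtype. Because the first line contains strings, it loads everything as S10 (10-character byte strings). On Python 3 with a recent NumPy the result may instead be a unicode dtype such as <U10.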
In [11]: data=np.genfromtxt(txt.splitlines(),delimiter=',',dtype=None)
In [12]: data
Out[12]:
array([['time', 'topic1', 'topic2', 'country'],
       ['2015-10-01', '20', '30', 'usa'],
       ['2015-10-02', '25', '35', 'usa']],
      dtype='|S10')
This matrix can be transposed, and printed or written to a csv file:
In [13]: data.T
Out[13]:
array([['time', '2015-10-01', '2015-10-02'],
       ['topic1', '20', '25'],
       ['topic2', '30', '35'],
       ['country', 'usa', 'usa']],
      dtype='|S10')
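The session above looks like it was run on Python 2, where the strings come back as bytes. As a rough Python 3 equivalent, the sketch below passes dtype=str (my choice here, not something the session above uses) so every cell, header included, is loaded as text, and then checks the shape before and after the transpose.

```python
import numpy as np

txt = """time,topic1,topic2,country
2015-10-01,20,30,usa
2015-10-02,25,35,usa"""

# dtype=str keeps every cell as text, so the header row survives as data
data = np.genfromtxt(txt.splitlines(), delimiter=",", dtype=str)

print(data.shape)     # (3, 4): 3 lines, 4 columns
print(data.T.shape)   # (4, 3): one row per original column
print(data.T[0])      # first transposed row: the 'time' column
```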
Since I'm using genfromtxt to load, I could use savetxt to save:
In [26]: with open('test.txt','w') as f:
   ....:     np.savetxt(f, data.T, delimiter=',', fmt='%12s')
   ....:     np.savetxt(f, data.T, delimiter=';', fmt='%10s')   # simulate a 2nd array
   ....:
In [27]: cat test.txt
        time,  2015-10-01,  2015-10-02
      topic1,          20,          25
      topic2,          30,          35
     country,         usa,         usa
      time;2015-10-01;2015-10-02
    topic1;        20;        25
    topic2;        30;        35
   country;       usa;       usa
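Putting the pieces together for the original question, the following is a minimal sketch of transposing several CSV files and appending them to one output file. The function name transpose_and_concat and its parameters are hypothetical, made up for this illustration. It assumes Python 3 with a reasonably recent NumPy, that each input file has at least two rows and two columns, and that no cell contains the delimiter or a '#' character (genfromtxt treats '#' as a comment marker by default).

```python
import numpy as np


def transpose_and_concat(in_paths, out_path, delimiter=","):
    """Hypothetical helper: transpose each CSV and append it to out_path."""
    with open(out_path, "w") as out:
        for path in in_paths:
            # dtype=str loads every cell, header included, as text
            data = np.genfromtxt(path, delimiter=delimiter, dtype=str)
            # fmt='%s' writes each cell unpadded; all files share one handle,
            # so the transposed blocks end up stacked one after another
            np.savetxt(out, data.T, delimiter=delimiter, fmt="%s")
```

Calling transpose_and_concat(['a.csv', 'b.csv'], 'combined.csv') (file names are placeholders) should leave the transposed rows of a.csv followed by those of b.csv in combined.csv. Using fmt='%s' instead of a fixed width like '%12s' avoids the padding seen in the output above.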