How To Groupby And Pivot A Dataframe With Non-numeric Values
I'm using Python, and I have a dataset of 6 columns, R, Rc, J, T, Ca and Cb. I need to 'aggregate' on the columns 'R' then 'J', so that for each R, each row is a unique 'J'. Rc is
Solution 1:
You can use pivot_table (see the docs) with a lambda function as the aggfunc argument:
table = pd.pivot_table(df, index=['R', 'Rc', 'J'], values=['Ca', 'Cb'], columns=['T'],
                       fill_value='', aggfunc=lambda x: ''.join(str(v) for v in x)).reset_index()
R Rc J Ca Cb
T 1 2 3 1 2 3
0 a p 1 x y z d e f
1 b o 1 w g
2 b o 2 v h
3 b o 3 s i
4 c n 1 t r j k
5 c n 2 u l
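To try the call end to end, here is a minimal sketch on a small made-up frame. The sample values are assumptions chosen for illustration, not the asker's data. Two rows deliberately share the same R, Rc, J and T, to show that the lambda concatenates their strings into a single cell.

```python
import pandas as pd

# Hypothetical sample data (not the original dataset). The last two rows
# share the same (R, Rc, J, T), so their values get joined into one cell.
df = pd.DataFrame({
    'R':  ['a', 'a', 'a', 'b', 'b'],
    'Rc': ['p', 'p', 'p', 'o', 'o'],
    'J':  [1, 1, 1, 1, 1],
    'T':  [1, 2, 3, 1, 1],
    'Ca': ['x', 'y', 'z', 'w', 'v'],
    'Cb': ['d', 'e', 'f', 'g', 'h'],
})

table = pd.pivot_table(
    df,
    index=['R', 'Rc', 'J'],
    values=['Ca', 'Cb'],
    columns=['T'],
    fill_value='',  # empty string where a (row, T) combination has no data
    aggfunc=lambda x: ''.join(str(v) for v in x),  # join strings, no numeric aggregation
).reset_index()

print(table)
```

A custom aggfunc is needed because the default aggregation (mean) only makes sense for numeric values.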
Then you can remove the multiindex columns and rename them as follows (taken from this great answer):
table.columns = ['%s%s' % (a, ' (T = %s)' % b if b else '') for a, b in table.columns]
R Rc J Ca (T = 1) Ca (T = 2) Ca (T = 3) Cb (T = 1) Cb (T = 2) Cb (T = 3)
0 a p 1 x y z d e f
1 b o 1 w g
2 b o 2 v h
3 b o 3 s i
4 c n 1 t r j k
5 c n 2 u l
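The renaming step can also be tried on its own. The sketch below builds a hypothetical one-row frame with two-level columns, shaped like the result of pivot_table(...).reset_index(), and flattens them. It tests the second level against the empty string rather than relying on truthiness, which is an adjustment on my part so that a T value of 0 would still get a suffix.

```python
import pandas as pd

# Hypothetical frame with two-level columns, mimicking what
# pivot_table(...).reset_index() produces: the former index columns
# carry an empty string on the second (T) level.
columns = pd.MultiIndex.from_tuples(
    [('R', ''), ('Rc', ''), ('J', ''),
     ('Ca', 1), ('Ca', 2), ('Cb', 1), ('Cb', 2)],
    names=[None, 'T'],
)
table = pd.DataFrame([['a', 'p', 1, 'x', 'y', 'd', 'e']], columns=columns)

# Collapse each (name, T) pair into a single flat label.
table.columns = [
    f"{name} (T = {t})" if t != '' else name
    for name, t in table.columns
]

print(list(table.columns))
```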