
How To Groupby And Pivot A Dataframe With Non-numeric Values

I'm using Python, and I have a dataset of 6 columns, R, Rc, J, T, Ca and Cb. I need to 'aggregate' on the columns 'R' then 'J', so that for each R, each row is a unique 'J'. Rc is

Solution 1:

You can use pivot_table (see the pandas docs) with a lambda function as the aggfunc argument, so that the non-numeric values falling into the same cell are joined into one string:

import pandas as pd
table = pd.pivot_table(df, index=['R', 'Rc', 'J'], values=['Ca', 'Cb'],
                       columns=['T'], fill_value='',
                       aggfunc=lambda x: ''.join(str(v) for v in x)).reset_index()


   R Rc  J Ca       Cb      
T           1  2  3  1  2  3
0  a  p  1  x  y  z  d  e  f
1  b  o  1  w        g      
2  b  o  2  v        h      
3  b  o  3  s        i      
4  c  n  1  t  r     j  k   
5  c  n  2  u        l      
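
For anyone who wants to run the pivot end to end, here is a minimal sketch. The sample frame is an assumption: it is reconstructed from the output shown above, because the question's original data is not reproduced here. Only the pivot_table call itself comes from the answer.

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed output (not the asker's real data).
df = pd.DataFrame({
    "R":  ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "Rc": ["p", "p", "p", "o", "o", "o", "n", "n", "n"],
    "J":  [1, 1, 1, 1, 2, 3, 1, 1, 2],
    "T":  [1, 2, 3, 1, 1, 1, 1, 2, 1],
    "Ca": ["x", "y", "z", "w", "v", "s", "t", "r", "u"],
    "Cb": ["d", "e", "f", "g", "h", "i", "j", "k", "l"],
})

# One row per (R, Rc, J); one column per (value column, T).
# The lambda joins the strings in each cell instead of computing a numeric mean.
table = pd.pivot_table(
    df,
    index=["R", "Rc", "J"],
    values=["Ca", "Cb"],
    columns=["T"],
    fill_value="",
    aggfunc=lambda x: "".join(str(v) for v in x),
).reset_index()

print(table)
```

If a cell could receive more than one value, joining with a separator such as ', ' instead of '' would keep the individual values readable.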

Then you can flatten the MultiIndex columns and rename them as follows (adapted from another answer):

table.columns = ['%s%s' % (a, ' (T = %s)' % b if b else '') for a, b in table.columns]

   R Rc  J Ca (T = 1) Ca (T = 2) Ca (T = 3) Cb (T = 1) Cb (T = 2) Cb (T = 3)
0  a  p  1          x          y          z          d          e          f
1  b  o  1          w                                g                      
2  b  o  2          v                                h                      
3  b  o  3          s                                i                      
4  c  n  1          t          r                     j          k           
5  c  n  2          u                                l                      
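
The renaming step can also be wrapped in a small helper. This is only a sketch: flatten_columns and its label parameter are hypothetical names, not part of pandas. It compares the second level against the empty string instead of relying on truthiness, so a value of T equal to 0 would still get its suffix.

```python
import pandas as pd


def flatten_columns(frame, label="T"):
    """Return a copy whose two-level columns are joined into plain strings.

    Hypothetical helper for illustration; `label` is the name shown in the suffix.
    """
    out = frame.copy()
    out.columns = [
        f"{top} ({label} = {sub})" if sub != "" else str(top)
        for top, sub in frame.columns
    ]
    return out


# Tiny stand-in for the pivoted table: index columns carry an empty second level.
cols = pd.MultiIndex.from_tuples(
    [("R", ""), ("Rc", ""), ("J", ""), ("Ca", 0), ("Ca", 1)]
)
wide = pd.DataFrame([["a", "p", 1, "x", "y"]], columns=cols)

flat = flatten_columns(wide)
print(list(flat.columns))
```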

Solution 2:

If I understand what you need, you can simply locate the needed rows like this:

df['Ca(T=1)'] = df['Ca'].loc[df['T'] == 1]

You have to repeat this for each value of T (and for Cb as well).
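
The repetition can be done in a loop. The sketch below uses a small assumed frame and Series.where, which keeps a value where the condition holds and leaves NaN elsewhere, the same result the .loc assignment gives after index alignment. Note that this approach keeps one row per original row; unlike Solution 1 it does not collapse rows that share the same R and J.

```python
import pandas as pd

# Assumed toy data with the same column names as in the question.
df = pd.DataFrame({
    "T":  [1, 2, 1],
    "Ca": ["x", "y", "w"],
    "Cb": ["d", "e", "g"],
})

# Collect the distinct T values first, then add one column per (value column, T).
for t in sorted(df["T"].unique()):
    for col in ("Ca", "Cb"):
        # Keep the value where T matches; other rows become NaN.
        df[f"{col}(T={t})"] = df[col].where(df["T"] == t)

print(df)
```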
