How To Get Count Of Column Values For Each Unique Pair Of Columns In Pandas?

I have the data given below: data = [(datetime.datetime(2020, 12, 21, 6, 50, 14, 955551), 'blr', 'del', 'medium'), (datetime.datetime(2020, 12, 21, 7, 6, 0, 242578), 'lon', 'd

Solution 1:

Filter the rows by type, then use GroupBy.size on the columns that form the pair:

s1 = df[df['type'] == 'medium'].groupby(['start','end']).size()
print(s1)
start  end
blr    del    1
lon    del    9
ny     del    1
dtype: int64
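Because the question's data is truncated above, the following is a minimal self-contained sketch of the same filter-then-group step. The rows are hypothetical and only assume that the frame has start, end, and type columns, so the counts differ from the output shown here.

```python
import pandas as pd

# Hypothetical sample rows; the question's full data is not reproduced here.
df = pd.DataFrame(
    [
        ("blr", "del", "medium"),
        ("lon", "del", "medium"),
        ("lon", "del", "medium"),
        ("lon", "del", "low"),
        ("ny", "del", "low"),
        ("ny", "del", "medium"),
    ],
    columns=["start", "end", "type"],
)

# Keep only the 'medium' rows, then count rows per (start, end) pair.
medium_counts = df[df["type"] == "medium"].groupby(["start", "end"]).size()
print(medium_counts)
```

The result is a Series indexed by a (start, end) MultiIndex, so a single pair can be read with medium_counts.loc[("lon", "del")].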

Or, if you want the counts for all combinations, including type:

s = df.groupby(['type','start','end']).size()
print(s)
type    start  end
low     lon    del    3
        ny     del    2
medium  blr    del    1
        lon    del    9
        ny     del    1
dtype: int64


Then select a single type from the first index level with loc:

print(s.loc['medium'])
start  end
blr    del    1
lon    del    9
ny     del    1
dtype: int64


print(s.loc['low'])
start  end
lon    del    1
ny     del    2
dtype: int64
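The group-once, select-later pattern can be sketched end to end as follows. The data is again hypothetical (only the column names start, end, and type are taken from the answer), and the variable names medium and low are illustrative.

```python
import pandas as pd

# Hypothetical rows with the same three columns used in the answer.
df = pd.DataFrame(
    {
        "start": ["blr", "lon", "lon", "lon", "ny", "ny"],
        "end": ["del", "del", "del", "del", "del", "del"],
        "type": ["medium", "medium", "medium", "low", "low", "medium"],
    }
)

# Count rows for every (type, start, end) combination in one pass.
s = df.groupby(["type", "start", "end"]).size()

# Selecting on the first index level drops that level,
# leaving a Series indexed by (start, end).
medium = s.loc["medium"]
low = s.loc["low"]

print(medium)
print(low)
```

Grouping once and slicing afterwards avoids filtering the frame separately for each type.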

Solution 2:

Use value_counts:

res = df[df['type'].eq('medium')].value_counts()
print(res)

Output

start  end  type  
lon    del  medium    9
ny     del  medium    1
blr    del  medium    1
dtype: int64

From the documentation:

Return a Series containing counts of unique rows in the DataFrame.

If you want to remove the type from the output, use droplevel, as suggested by @jezrael:

res = df[df['type'].eq('medium')].value_counts().droplevel(level=-1)
print(res)

Output

start  end
lon    del    9
ny     del    1
blr    del    1
dtype: int64

This can also be extended for all types, for example, using:

res = df.value_counts(subset=['type', 'start', 'end']).sort_index(level=0)
print(res)

Output

type    start  end
low     lon    del    3
        ny     del    2
medium  blr    del    1
        lon    del    9
        ny     del    1
dtype: int64
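As a quick check that this matches the groupby approach from Solution 1, the sketch below compares the two on hypothetical data. It assumes a pandas version that provides DataFrame.value_counts (1.1 or later); the variable names are illustrative.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the answers above.
df = pd.DataFrame(
    {
        "type": ["medium", "medium", "medium", "low", "low", "medium"],
        "start": ["blr", "lon", "lon", "lon", "ny", "ny"],
        "end": ["del", "del", "del", "del", "del", "del"],
    }
)

cols = ["type", "start", "end"]

# value_counts orders by frequency, so sort the index to get a stable order.
by_value_counts = df.value_counts(subset=cols).sort_index()

# groupby sorts the group keys by default.
by_groupby = df.groupby(cols).size()

# Both Series hold the same count for every (type, start, end) combination.
print(by_value_counts.to_dict() == by_groupby.to_dict())
```

The main practical difference is ordering: value_counts sorts by count in descending order unless the index is sorted afterwards, while groupby returns the keys in sorted order.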

Solution 3:

Mask the non-matching rows with where, drop them, then group and count:

df.where(lambda x: x.type == "medium").dropna().groupby(['start', 'end']).type.agg("count")
start  end
blr    del    1
lon    del    9
ny     del    1
Name: type, dtype: int64
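To make the steps in this one-liner explicit, here is a small sketch on hypothetical data that runs the where/dropna chain next to plain boolean indexing. The names via_where and via_mask are illustrative.

```python
import pandas as pd

# Hypothetical rows with the columns used in the answers above.
df = pd.DataFrame(
    {
        "start": ["blr", "lon", "lon", "ny"],
        "end": ["del", "del", "del", "del"],
        "type": ["medium", "medium", "low", "medium"],
    }
)

# where() keeps the matching rows and replaces every other row with NaN;
# dropna() then removes those NaN rows before grouping.
via_where = (
    df.where(lambda x: x["type"] == "medium")
    .dropna()
    .groupby(["start", "end"])["type"]
    .count()
)

# Boolean indexing selects the same rows directly.
via_mask = df[df["type"] == "medium"].groupby(["start", "end"]).size()

print(via_where.to_dict() == via_mask.to_dict())
```

Note that count tallies non-null values of type while size tallies rows. The two agree here because dropna has already removed every row containing a missing value, which also means this approach would discard matching rows that have NaN in any other column.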