Sort Within Group Without Changing Group Order?
Solution 1:
You could create a temporary column that maps GroupB, GroupA and GroupC to consecutive integers, so that the original (non-alphabetical) group order survives the sort, and then drop the temporary column afterwards. Answer #1 is more dynamic and works even if the rows of a group are not consecutive. Answer #2 requires the rows of each group to be consecutive (the inputs for Answer #1 and Answer #2 are ordered differently).
Updated Answer #1 (per comment: the groups are not consecutive in rows, but we still want to order them by the first appearance of each group). The code [l for l in enumerate((df['group'].unique()))] assigns a number to each group according to where that group first appears in the group column of the dataframe.
In[1]:
    name   group  revenue
0  Name1  GroupB        1
3  Name4  GroupA        4
4  Name5  GroupA        5
8  Name7  GroupC        9
1  Name2  GroupB        2
2  Name3  GroupB        3
5  Name6  GroupA        6
6  Name7  GroupC        7
7  Name7  GroupC        8
dft = pd.DataFrame([l for l in enumerate((df['group'].unique()))], columns=['group_number','group'])
df = pd.merge(df, dft, how='left', on='group').sort_values(['group_number', 'revenue'], ascending = [True, False])
df
Out[1]:
    name   group  revenue  group_number
5  Name3  GroupB        3             0
4  Name2  GroupB        2             0
0  Name1  GroupB        1             0
6  Name6  GroupA        6             1
2  Name5  GroupA        5             1
1  Name4  GroupA        4             1
3  Name7  GroupC        9             2
8  Name7  GroupC        8             2
7  Name7  GroupC        7             2
I wanted to highlight the output of dft from the enumerate line of code before the merge and sort.
dft = pd.DataFrame([l for l in enumerate((df['group'].unique()))], columns=['group_number','group'])
dft
Out[2]:
   group_number   group
0             0  GroupB
1             1  GroupA
2             2  GroupC
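For reference, here is a minimal self-contained sketch of the same idea. It rebuilds the interleaved input shown under In[1] and, as a swapped-in alternative to the enumerate/merge step, uses pd.factorize to number the groups by first appearance. The variable name result is only illustrative, and the index differs from Out[1] because no merge takes place.

```python
import pandas as pd

# Same rows as the In[1] table: the groups are interleaved, not consecutive.
df = pd.DataFrame({
    "name": ["Name1", "Name4", "Name5", "Name7", "Name2",
             "Name3", "Name6", "Name7", "Name7"],
    "group": ["GroupB", "GroupA", "GroupA", "GroupC", "GroupB",
              "GroupB", "GroupA", "GroupC", "GroupC"],
    "revenue": [1, 4, 5, 9, 2, 3, 6, 7, 8],
})

# factorize() numbers each distinct value by order of first appearance,
# so for this frame GroupB -> 0, GroupA -> 1, GroupC -> 2.
df["group_number"] = pd.factorize(df["group"])[0]

# Sort by first-appearance order, then revenue descending within each group,
# and drop the helper column again.
result = (
    df.sort_values(["group_number", "revenue"], ascending=[True, False])
      .drop(columns="group_number")
)
print(result)
```

This avoids the temporary dft frame and the merge, at the cost of differing from the code in the answer.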
Answer #2
import pandas as pd
df = pd.DataFrame({'name': ['Name1','Name2','Name3','Name4','Name5','Name6', 'Name7', 'Name7', 'Name7'],
'group':['GroupB','GroupB','GroupB','GroupA','GroupA','GroupA','GroupC','GroupC','GroupC'],'revenue':[1,2,3,4,5,6,7,8,9]})
df['cs'] = (df['group'] != df['group'].shift(1)).cumsum()
df = df.sort_values(['cs', 'revenue'], ascending = [True, False])
df
Out[11]:
name group revenue cs
2 Name3 GroupB 3 1
1 Name2 GroupB 2 1
0 Name1 GroupB 1 1
5 Name6 GroupA 6 2
4 Name5 GroupA 5 2
3 Name4 GroupA 4 2
8 Name7 GroupC 9 3
7 Name7 GroupC 8 3
6 Name7 GroupC 7 3
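To see why Answer #2 needs consecutive groups, the sketch below applies the same shift/cumsum expression to a small interleaved column. The names groups, run_id and runs_per_group are only illustrative. The expression numbers runs of equal values, not groups, so a group that appears in two separate runs gets two different numbers.

```python
import pandas as pd

# Interleaved groups: GroupB appears in two separate runs.
groups = pd.Series(["GroupB", "GroupA", "GroupA", "GroupC", "GroupB", "GroupB"])

# A new run starts wherever the value differs from the previous row.
run_id = (groups != groups.shift(1)).cumsum()
print(run_id.tolist())

# Number of distinct run ids each group received; more than one means
# sorting on run_id would split that group apart.
runs_per_group = run_id.groupby(groups).nunique()
print(runs_per_group.to_dict())
```

With consecutive input every group has exactly one run, which is the case the answer covers.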
For both answers, just drop the temporary column afterwards (cs here, group_number in Answer #1):
df = df.drop('cs', axis=1)
Out[12]:
    name   group  revenue
2  Name3  GroupB        3
1  Name2  GroupB        2
0  Name1  GroupB        1
5  Name6  GroupA        6
4  Name5  GroupA        5
3  Name4  GroupA        4
8  Name7  GroupC        9
7  Name7  GroupC        8
6  Name7  GroupC        7
Solution 2:
Why use groupby at all? You could just chain multiple sort_values calls to get the correct sort order. For example, using data similar to the linked question, if you want to sort by revenue descending while keeping the groups in ascending order, you could do:
import pandas as pd
df = pd.DataFrame({'name': ['Name1','Name2','Name3','Name4','Name5','Name6', 'Name7', 'Name7', 'Name7'],
'group':['GroupB','GroupB','GroupB','GroupA','GroupA','GroupA','GroupC','GroupC','GroupC'],'revenue':[1,2,3,4,5,6,7,8,9]})
# the second sort must be stable so it keeps the revenue order within each group
df.sort_values(by='revenue', ascending=False).sort_values(by='group', kind='mergesort')
Which would return:
    name   group  revenue
5  Name6  GroupA        6
4  Name5  GroupA        5
3  Name4  GroupA        4
2  Name3  GroupB        3
1  Name2  GroupB        2
0  Name1  GroupB        1
8  Name7  GroupC        9
7  Name7  GroupC        8
6  Name7  GroupC        7
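Chaining only works if the second sort is stable, that is, if it keeps rows with equal group values in the order the first sort left them. As a hedged sketch (the names chained and single are only illustrative), the block below requests a stable algorithm explicitly and compares the result with a single sort_values call on both keys, which should give the same ordering without relying on stability.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Name1", "Name2", "Name3", "Name4", "Name5",
             "Name6", "Name7", "Name7", "Name7"],
    "group": ["GroupB", "GroupB", "GroupB", "GroupA", "GroupA",
              "GroupA", "GroupC", "GroupC", "GroupC"],
    "revenue": [1, 2, 3, 4, 5, 6, 7, 8, 9],
})

# Chained version: sort by the inner key first, then stably by the outer key.
chained = (
    df.sort_values(by="revenue", ascending=False)
      .sort_values(by="group", kind="mergesort")
)

# Single call: group ascending, revenue descending within each group.
single = df.sort_values(by=["group", "revenue"], ascending=[True, False])

print(chained.equals(single))
```

Note that both variants order the groups alphabetically (GroupA, GroupB, GroupC), not by their order of appearance.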