Drop Unordered Duplicates Across Separate Columns
I am trying to return a df where duplicate values have been removed. I have tried to use drop_duplicates(), but the values in the subset columns aren't in the same order from row to row. As i
Solution 1:
You'll need to sort the values of the two columns within each row (along the horizontal axis), then get a mask to subset the original frame. Here's how you can use np.sort and df.duplicated to do that:
df[~pd.DataFrame(np.sort(df[['Item_X', 'Item_Y']], axis=1)).duplicated()]
Item_X Item_Y Value
0 Foo Bar 1
2 Bot Foo 3
3 Bot Bot 4
4 Bar Bar 5
5 Foo Foo 6
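For anyone who wants to run this end to end, here is a minimal sketch. The question's input frame isn't shown, so the data below is an assumption reconstructed from the output; in particular, row 1 (Bar, Foo, 2) is a guess at the row that gets dropped.

```python
import numpy as np
import pandas as pd

# Hypothetical input frame; row 1 is assumed, not taken from the question
df = pd.DataFrame({
    "Item_X": ["Foo", "Bar", "Bot", "Bot", "Bar", "Foo"],
    "Item_Y": ["Bar", "Foo", "Foo", "Bot", "Bar", "Foo"],
    "Value": [1, 2, 3, 4, 5, 6],
})

# Sort each row's pair so ("Foo", "Bar") and ("Bar", "Foo") compare equal
pairs = np.sort(df[["Item_X", "Item_Y"]].to_numpy(), axis=1)

# True for every row whose sorted pair has already appeared higher up
is_dup = pd.DataFrame(pairs).duplicated()

# Keep only the first occurrence of each unordered pair
result = df[~is_dup]
print(result)
```

duplicated() keeps the first occurrence by default; pass keep='last' if you'd rather keep the last one.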
Solution 2:
IIUC, use:
m=pd.DataFrame(np.sort(df[['Item_X','Item_Y']])).duplicated()
df[~m]
Item_X Item_Y Value
0 Foo Bar 1
2 Bot Foo 3
3 Bot Bot 4
4 Bar Bar 5
5 Foo Foo 6
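One caveat with both answers: the mask is built from a fresh DataFrame, so it carries a default 0..n-1 index. That lines up with df only when df also has the default index. Below is a sketch of a variant that sidesteps this by converting the mask to a plain NumPy array before indexing. The helper name drop_unordered_duplicates and the sample frame are made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

def drop_unordered_duplicates(frame, cols):
    """Drop rows whose values in `cols` repeat an earlier row, ignoring order."""
    # Sort within each row so the order of the values no longer matters
    sorted_vals = np.sort(frame[cols].to_numpy(), axis=1)
    # Positional boolean mask, independent of frame's index labels
    is_dup = pd.DataFrame(sorted_vals).duplicated().to_numpy()
    return frame.loc[~is_dup]

# Hypothetical frame with a non-default (string) index
df = pd.DataFrame(
    {
        "Item_X": ["Foo", "Bar", "Bot", "Bot", "Bar", "Foo"],
        "Item_Y": ["Bar", "Foo", "Foo", "Bot", "Bar", "Foo"],
        "Value": [1, 2, 3, 4, 5, 6],
    },
    index=list("abcdef"),
)

deduped = drop_unordered_duplicates(df, ["Item_X", "Item_Y"])
print(deduped)
```

Because the mask is positional, the original index labels are kept on the rows that survive.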