Groupby Issues Of Not Recognizing Numeric Column Pandas Python
I have Excel data that I read in with pd.read_excel: Block Concentration Name Replicate 1 Array Marker 1 Array
Solution 1:
Instead of cumcount() + 1, you can use a rolling count with a moving window of 3:
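# group by Block and Name, then set a rolling count computed on column Block
# Series.rolling(...).count() replaces pd.rolling_count, which newer pandas versions no longer provide
data["Replicate"] = data.groupby(["Block", "Name"])["Block"].transform(lambda s: s.rolling(window=3, min_periods=1).count())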
The formatting looks very strange. If that is not just an artifact of copying the data into the question, you can repair it by casting the Concentration column to float and stripping leading and trailing whitespace from the Name column.
Block  Concentration  Name            Replicate
1                     ArrayMarker
1                     ArrayMarker
1      100.0          Man5GlcNAc2
1      33.0           Man5GlcNAc2
1      10.0           Man5GlcNAc2
1      100.0          Man6GlcNAc2
1      33.0           Man6GlcNAc2
1      10.0           Man6GlcNAc2
1      100.0          Man7GlcNAc2 D1
1      33.0           Man7GlcNAc2 D1
1      10.0           Man7GlcNAc2 D1
1      100.0          Man7GlcNAc2 D3
1      33.0           Man7GlcNAc2 D3
1      10.0           Man7GlcNAc2 D3
# convert column Concentration to float
data['Concentration'] = data['Concentration'].astype(float)
# strip leading and trailing whitespace
data['Name'] = data['Name'].str.strip()
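# group by Block and Name, then set a rolling count computed on column Block
# Series.rolling(...).count() replaces pd.rolling_count, which newer pandas versions no longer provide
data["Replicate"] = data.groupby(["Block", "Name"])["Block"].transform(lambda s: s.rolling(window=3, min_periods=1).count())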
    Block  Concentration  Name            Replicate
0   1                     ArrayMarker     1
1   1                     ArrayMarker     2
2   1      100            Man5GlcNAc2     1
3   1      33             Man5GlcNAc2     2
4   1      10             Man5GlcNAc2     3
5   1      100            Man6GlcNAc2     1
6   1      33             Man6GlcNAc2     2
7   1      10             Man6GlcNAc2     3
8   1      100            Man7GlcNAc2 D1  1
9   1      33             Man7GlcNAc2 D1  2
10  1      10             Man7GlcNAc2 D1  3
11  1      100            Man7GlcNAc2 D3  1
12  1      33             Man7GlcNAc2 D3  2
13  1      10             Man7GlcNAc2 D3  3
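For anyone who wants to try the clean-up and the rolling count end to end, here is a minimal self-contained sketch. The sample frame is hypothetical (made-up rows with deliberately untidy strings, not the asker's real spreadsheet), and it assumes a pandas version that provides Series.rolling.

```python
import pandas as pd

# Hypothetical sample data: concentrations stored as text, names with stray spaces.
data = pd.DataFrame({
    "Block": [1, 1, 1, 1, 1, 1],
    "Concentration": ["100", "33", "10", "100", "33", "10"],
    "Name": [" Man5GlcNAc2", "Man5GlcNAc2 ", "Man5GlcNAc2",
             "Man6GlcNAc2 ", " Man6GlcNAc2", "Man6GlcNAc2"],
})

# Cast the text column to float and trim whitespace so equal names compare equal.
data["Concentration"] = data["Concentration"].astype(float)
data["Name"] = data["Name"].str.strip()

# Rolling count of non-missing Block values within each (Block, Name) group.
data["Replicate"] = (
    data.groupby(["Block", "Name"])["Block"]
    .transform(lambda s: s.rolling(window=3, min_periods=1).count())
    .astype(int)
)

print(data)
```

Note that the count saturates at the window size, so this approach only numbers replicates correctly when each group has at most three rows.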
Solution 2:
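If you remove 'Concentration' from the groupby keys, you will get the expected output. Each concentration occurs only once per name, so grouping on it as well leaves a single row in every group.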
data["Replicate"] = data.groupby(["Block", "Name"]).cumcount()+1
>>> data
   Block  Concentration  Name           Replicate
0  1      ''             Array.Marker   1
1  1      ''             Array.Marker   2
2  1      100.0          Man5GlcNAc2    1
3  1      33.0           Man5GlcNAc2    2
4  1      10.0           Man5GlcNAc2    3
5  1      100.0          Man6GlcNAc2    1
6  1      33.0           Man6GlcNAc2    2
7  1      10.0           Man6GlcNAc2    3
8  1      100.0          Man7GlcNAc2D1  1
9  1      33.0           Man7GlcNAc2D1  2
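To see why dropping 'Concentration' matters, here is a small sketch comparing the two groupings. The frame is a hypothetical reduced sample, and the three-column key list is an assumption about what the original attempt looked like.

```python
import pandas as pd

# Hypothetical reduced sample: three concentrations for each of two names.
data = pd.DataFrame({
    "Block": [1] * 6,
    "Concentration": [100.0, 33.0, 10.0, 100.0, 33.0, 10.0],
    "Name": ["Man5GlcNAc2"] * 3 + ["Man6GlcNAc2"] * 3,
})

# Assumed original attempt: every (Block, Concentration, Name) combination is
# unique, so each group holds one row and the counter never advances.
with_conc = data.groupby(["Block", "Concentration", "Name"]).cumcount() + 1

# Grouping on Block and Name only numbers the rows within each name.
without_conc = data.groupby(["Block", "Name"]).cumcount() + 1

print(with_conc.tolist())     # [1, 1, 1, 1, 1, 1]
print(without_conc.tolist())  # [1, 2, 3, 1, 2, 3]
```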