Fast Way To Get The Number Of NaNs In A Column Counted From The Last Valid Value In A DataFrame
Say I have a DataFrame like
        A      B
0  0.1880  0.345
1  0.2510  0.585
2     NaN    NaN
3     NaN    NaN
4     NaN  1.150
5  0.2300  1.210
6  0.1670  1.290
Solution 1:
You can use:
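# a: True where df holds NaN
a = df.isnull()
# b: running count of NaNs down each column
b = a.cumsum()
# subtract the running count as of the last valid value
df1 = b.sub(b.mask(a).ffill().fillna(0).astype(int))
print(df1)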
    A  B
0   0  0
1   0  0
2   1  1
3   2  2
4   3  0
5   0  0
6   0  0
7   0  0
8   0  1
9   0  2
10  1  3
11  2  4
12  3  5
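As a self-contained check, the sketch below applies the same steps to the seven rows shown in the question. The function name nan_run_length is a hypothetical label used only for this illustration; it is not a pandas function.

```python
import numpy as np
import pandas as pd


def nan_run_length(df):
    """Count consecutive NaNs in each column since the last valid value.

    Hypothetical helper name; the body follows the steps from the answer.
    """
    is_nan = df.isnull()          # True where a value is missing
    running = is_nan.cumsum()     # running NaN count per column
    # Running count as it stood at the most recent valid value (0 if none yet).
    baseline = running.mask(is_nan).ffill().fillna(0).astype(int)
    return running.sub(baseline)


# The seven rows shown in the question.
df = pd.DataFrame({
    "A": [0.188, 0.251, np.nan, np.nan, np.nan, 0.230, 0.167],
    "B": [0.345, 0.585, np.nan, np.nan, 1.150, 1.210, 1.290],
})

result = nan_run_length(df)
print(result)
```

For these rows, column A should count 1, 2, 3 across rows 2 to 4, while column B should count 1, 2 across rows 2 to 3 and reset to 0 at row 4, where B has a valid value again.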
For better understanding:
import pandas as pd

# a2: put NaN wherever a is True (the original NaN positions)
a2 = b.mask(a)
# a3: forward-fill, so each NaN position carries the count at the last valid value
a3 = a2.ffill()
# a4: leading NaNs have no previous valid value, so fill with 0, then cast to int
a4 = a3.fillna(0).astype(int)
# a5: subtract a4 from b to get the NaN count since the last valid value
a5 = b.sub(a4)
df1 = pd.concat([a, b, a2, a3, a4, a5], axis=1,
                keys=['a', 'b', 'where', 'ffill nan', 'substract', 'output'])
print(df1)
               a      b      where  ffill nan  substract   output
        A      B   A  B     A    B     A    B     A    B    A   B
0   False  False   0  0   0.0  0.0   0.0  0.0     0    0    0   0
1   False  False   0  0   0.0  0.0   0.0  0.0     0    0    0   0
2    True   True   1  1   NaN  NaN   0.0  0.0     0    0    1   1
3    True   True   2  2   NaN  NaN   0.0  0.0     0    0    2   2
4    True  False   3  2   NaN  2.0   0.0  2.0     0    2    3   0
5   False  False   3  2   3.0  2.0   3.0  2.0     3    2    0   0
6   False  False   3  2   3.0  2.0   3.0  2.0     3    2    0   0
7   False  False   3  2   3.0  2.0   3.0  2.0     3    2    0   0
8   False   True   3  3   3.0  NaN   3.0  2.0     3    2    0   1
9   False   True   3  4   3.0  NaN   3.0  2.0     3    2    0   2
10   True   True   4  5   NaN  NaN   3.0  2.0     3    2    1   3
11   True   True   5  6   NaN  NaN   3.0  2.0     3    2    2   4
12   True   True   6  7   NaN  NaN   3.0  2.0     3    2    3   5
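One way to sanity-check the vectorized steps is to compare them against a plain loop that resets a counter at every valid value. This is only a sketch: nan_streak_loop and nan_streak_vectorized are hypothetical names, and the short test column is made up for illustration.

```python
import numpy as np
import pandas as pd


def nan_streak_loop(series):
    """Reference version: explicit loop, counter resets at each valid value."""
    counts = []
    streak = 0
    for is_missing in series.isnull():
        streak = streak + 1 if is_missing else 0
        counts.append(streak)
    return pd.Series(counts, index=series.index)


def nan_streak_vectorized(df):
    """Same steps as the answer: cumsum minus the forward-filled baseline."""
    a = df.isnull()
    b = a.cumsum()
    return b.sub(b.mask(a).ffill().fillna(0).astype(int))


# Made-up column: two NaNs, a valid value, then one trailing NaN.
col = pd.Series([1.0, np.nan, np.nan, 2.0, np.nan], name="A")

loop_result = nan_streak_loop(col)
vector_result = nan_streak_vectorized(col.to_frame())["A"]

print(loop_result.tolist())
print(vector_result.tolist())
```

The loop is easier to read but runs in Python for every element, which is why the cumsum-based version is the one to prefer on large frames.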