
Conditionally Aggregating Pandas Dataframe

I have a DataFrame that looks like:

import pandas as pd
df = pd.DataFrame([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 1

Solution 1:

IIUC, you can use expanding in modern pandas to handle this:

>>> cols = ["A","C","D","E"]
>>> df[cols] * 2 + df[cols].expanding(axis=1).mean().shift(axis=1).fillna(0)

      A     C     D          E
0   2.0   7.0  10.0  12.666667
1  18.0  31.0  34.0  36.666667
2  34.0  55.0  58.0  60.666667

This reproduces your expected new columns (and has A become twice its original value, thanks to the fillna turning the NaNs to 0s).
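If your pandas version warns about or rejects axis=1 in expanding, the same calculation can be sketched by transposing first. The frame below is an assumption: the question's DataFrame is truncated, so the values 1.0 to 24.0 and the column names A to H are inferred from the df[cols] values used in this answer. The name prior_mean is only illustrative.

```python
import pandas as pd

# Assumed input: 3 rows holding 1.0..24.0 in columns A..H
# (the question's frame is truncated, so this is a guess).
values = [[float(8 * r + c + 1) for c in range(8)] for r in range(3)]
df = pd.DataFrame(values, columns=list("ABCDEFGH"))

cols = ["A", "C", "D", "E"]

# Transpose so the selected columns become rows, take the expanding
# mean down those rows, shift by one so each entry only sees earlier
# columns, then transpose back and replace the leading NaN with 0.
prior_mean = df[cols].T.expanding().mean().shift().T.fillna(0)

result = df[cols] * 2 + prior_mean
print(result)
```

This only swaps the axis handling; the expanding/shift/fillna chain is otherwise the same as in the one-liner.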


We can see where this comes from step by step:

Starting from

>>> df[cols]

      A     C     D     E
0   1.0   3.0   4.0   5.0
1   9.0  11.0  12.0  13.0
2  17.0  19.0  20.0  21.0

>>> df[cols].expanding(axis=1)
Expanding[min_periods=1,center=False,axis=1]

We can do sum first, because it's easier to check visually:

>>> df[cols].expanding(axis=1).sum()

      A     C     D     E
0   1.0   4.0   8.0  13.0
1   9.0  20.0  32.0  45.0
2  17.0  36.0  56.0  77.0

>>> df[cols].expanding(axis=1).mean()

      A     C          D      E
0   1.0   2.0   2.666667   3.25
1   9.0  10.0  10.666667  11.25
2  17.0  18.0  18.666667  19.25
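One way to convince yourself of what the expanding mean is doing: it is the running sum across the columns divided by how many columns have been seen so far. A minimal sketch, rebuilding the four selected columns by hand from the df[cols] values (the names block, running_sum, counts and running_mean are only illustrative):

```python
import numpy as np
import pandas as pd

# The four selected columns, typed in from the df[cols] values.
block = pd.DataFrame({
    "A": [1.0, 9.0, 17.0],
    "C": [3.0, 11.0, 19.0],
    "D": [4.0, 12.0, 20.0],
    "E": [5.0, 13.0, 21.0],
})

# Running sum across each row, left to right.
running_sum = block.cumsum(axis=1)

# Number of columns included at each position: 1, 2, 3, 4.
counts = np.arange(1, block.shape[1] + 1)

# Dividing by the counts (broadcast across the columns) gives the running mean.
running_mean = running_sum / counts
print(running_mean)
```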

>>> df[cols].expanding(axis=1).mean().shift(axis=1)

    A     C     D          E
0 NaN   1.0   2.0   2.666667
1 NaN   9.0  10.0  10.666667
2 NaN  17.0  18.0  18.666667

>>> df[cols].expanding(axis=1).mean().shift(axis=1).fillna(0)

     A     C     D          E
0  0.0   1.0   2.0   2.666667
1  0.0   9.0  10.0  10.666667
2  0.0  17.0  18.0  18.666667
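As a cross-check, the same numbers can be built with a plain loop: each new column is twice the original column plus the mean of the selected columns to its left, with 0 for the first column. This is only a sketch for verification, not a replacement for the vectorized version; block, pieces and by_hand are illustrative names, and the values are typed in from the df[cols] table.

```python
import pandas as pd

# The four selected columns, typed in from the df[cols] values.
block = pd.DataFrame({
    "A": [1.0, 9.0, 17.0],
    "C": [3.0, 11.0, 19.0],
    "D": [4.0, 12.0, 20.0],
    "E": [5.0, 13.0, 21.0],
})
cols = list(block.columns)

pieces = {}
for i, col in enumerate(cols):
    # Mean of the columns to the left; the first column has none, so use 0.
    prior_mean = block[cols[:i]].mean(axis=1) if i else 0.0
    pieces[col] = 2 * block[col] + prior_mean

by_hand = pd.DataFrame(pieces)
print(by_hand)
```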
