Pandas Dataframe: Uniformly Scale Down Values When Column Sum Exceeds Threshold
Initial Situation
Consider the following example dataframe:
df = pd.DataFrame({
    'A': [3., 2., 1., np.nan],
    'B': [7., np.nan, 1., 3.],
    'C': [4., 5., 1., 2.],
    'D': [1
Solution 1:
Here is one method:
thres = 10
result = df * thres / df.sum().clip(lower=thres)
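To see why the clip works, here is a minimal sketch that breaks the one-liner into steps. It assumes only columns A, B and C from the example (column D is cut off in the question as quoted here, so it is left out), and the names col_sums and divisor are just illustrative. Clipping the column sums from below at the threshold means a column that sums to less than the threshold gets the threshold itself as divisor, so the factor thres / divisor is exactly 1 and the column is left alone.

```python
import numpy as np
import pandas as pd

# Columns A, B and C from the question; D is omitted because it is truncated there.
df = pd.DataFrame({
    'A': [3., 2., 1., np.nan],
    'B': [7., np.nan, 1., 3.],
    'C': [4., 5., 1., 2.],
})

thres = 10

# Column sums; DataFrame.sum() skips NaN by default.
col_sums = df.sum()

# Sums below the threshold are raised to the threshold,
# so the scaling factor thres / divisor becomes 1 for those columns.
divisor = col_sums.clip(lower=thres)

# Scale every column; only columns whose sum exceeded thres actually change.
result = df * thres / divisor

print(result)
print(result.sum())
```

After scaling, no column sum should exceed thres, and NaN entries stay NaN.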
Solution 2:
Here is another method:
colSums = df.sum()
df / ((colSums * (colSums > 10) / 10) + (colSums <= 10))
Here, colSums holds the sum of each column. The denominator then checks whether each column's sum exceeds 10: for those columns it equals colSums / 10, so dividing rescales the column to sum to 10. For columns whose sum is 10 or less, the first term is 0 and the second term adds 1, so the divisor is 1 rather than 0 and the column is left unchanged. The dataframe is then divided column-wise by this denominator, which returns the desired result.
Out[46]:
     A         B         C    D
0  3.0  6.363636  3.333333  1.0
1  2.0       NaN  4.166667  0.0
2  1.0  0.909091  0.833333  2.0
3  NaN  2.727273  1.666667  3.0
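The boolean arithmetic in the denominator can be harder to read than it needs to be. As a possible alternative (not part of the original answer), the same denominator can be built with Series.where, which keeps values where the condition holds and substitutes a fallback elsewhere. The sketch below again assumes only columns A, B and C, since column D is truncated in the question, and checks that both denominators give the same result.

```python
import numpy as np
import pandas as pd

# Columns A, B and C from the question; D is omitted because it is truncated there.
df = pd.DataFrame({
    'A': [3., 2., 1., np.nan],
    'B': [7., np.nan, 1., 3.],
    'C': [4., 5., 1., 2.],
})

colSums = df.sum()

# Denominator from Solution 2: boolean masks act as 0/1 multipliers.
denom_mask = (colSums * (colSums > 10) / 10) + (colSums <= 10)

# Same idea with Series.where: keep colSums / 10 where the sum exceeds 10,
# otherwise fall back to a divisor of 1 (column left unchanged).
denom_where = (colSums / 10).where(colSums > 10, 1.0)

scaled_mask = df / denom_mask
scaled_where = df / denom_where

# Both versions should agree, with NaN treated as equal to NaN.
print(np.allclose(scaled_mask, scaled_where, equal_nan=True))
```

Which form to prefer is mostly a matter of taste; the where version states the fallback divisor of 1 explicitly.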