
How To Reduce A Function That Is Increasing Constantly In Python?

I have a dataframe as: big fc15 fc16 fc17 fc18 fc19 ... fc23 fc24 fc25 fc26 fc27 fc28 28 2018-10-01 2019-02-01 2

Solution 1:

This is a draft solution, because I'm not exactly sure about a couple of issues: how many columns you want to include when calculating the maximum (a changeable amount, bounded below by the minimum column number), what the maximum and minimum values of df.big are, and how to handle max(big) (see the if before the minuend assignment).

Overall, though, it scales to an arbitrary number of columns.

About groupby.apply: the function runs once per group (all rows with the same value of big), not once per row.
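To illustrate the point about groupby.apply, here is a minimal sketch on a toy frame. The column x and the helper record_group are made up for this illustration and are not part of the solution; the helper only records how many rows it receives on each call.

```python
import pandas as pd

# Toy data: two distinct values of big, five rows in total (hypothetical example)
df_toy = pd.DataFrame({"big": [16, 16, 17, 17, 17], "x": [1, 2, 3, 4, 5]})

calls = []  # one entry per call of the applied function

def record_group(grp):
    # grp is a DataFrame holding all rows of one group
    calls.append(len(grp))
    return grp["x"].sum()

# Select the value column explicitly so the function sees only x
sums = df_toy.groupby("big")[["x"]].apply(record_group)

print(calls)  # sizes of the groups the function was called with
print(sums)
```

The function is called with whole groups, so it runs far fewer times than there are rows when big takes only a handful of distinct values.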

Setup:

import pandas as pd
import numpy as np
from datetime import datetime as dt

np.random.seed(1234)

def pp_rounded(start, end, n):
    start_u = start.value//10**9
    end_u = end.value//10**9
    output = pd.Series(
        pd.DatetimeIndex(
            (10**9*np.random.randint(start_u, end_u, n, dtype=np.int64)
            ).view('M8[ns]')
        )
    ).dt.floor("1D")
    return output

start = pd.to_datetime('2010-01-01')
end = pd.to_datetime('2020-01-01')
rows_num = 100_000

biglist = np.random.randint(16, 30, rows_num)
df = pd.DataFrame(biglist, columns=["big"])
for i in range(15, 29):
    df[f"fc{i}"] = pp_rounded(start, end, rows_num)

big_max = df.big.max()
col_min = 15

Main functions:

def pick_col_names(big_grp):
    global col_min, big_max
    number_columns_to_include = 6
    # Walk backwards from the column just before fc{big}, stopping at col_min
    start = min(big_max, big_grp) - 1
    end = max(col_min, start - number_columns_to_include) - 1
    column_names = [f'fc{j}' for j in range(start, end, -1)]
    return column_names

def calc_time_without_offer(grp, big_max):
    # Every row in the group shares the same value of big
    big = grp.big.iloc[0]
    column_names = pick_col_names(big)

    # Minuend: today's date when big is at its maximum, otherwise the fc{big} column
    minuend = None
    if big >= big_max:
        minuend = pd.to_datetime(dt.today().date())
    else:
        minuend = grp[f"fc{big}"]

    grp['time_without_offer'] = minuend - grp[column_names].max(axis=1)
    return grp
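To see which columns the selection logic picks for a given big, the following sketch repeats the same arithmetic without pandas. The name pick_window is hypothetical, and it takes col_min and big_max as parameters instead of globals; the defaults 15 and 29 are assumptions matching the setup above (big is drawn from 16 to 29).

```python
def pick_window(big_grp, col_min=15, big_max=29, number_columns_to_include=6):
    # Same arithmetic as pick_col_names, with the globals turned into parameters
    start = min(big_max, big_grp) - 1
    end = max(col_min, start - number_columns_to_include) - 1
    return [f"fc{j}" for j in range(start, end, -1)]

# Inspect the window for a small, a middle and the largest value of big
windows = {big: pick_window(big) for big in (16, 20, 29)}
for big, cols in windows.items():
    print(big, cols)
```

Note that, as written, the window can hold up to number_columns_to_include + 1 columns (for big = 29 it runs from fc28 down to fc22), so lower the constant by one if exactly six columns are wanted.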

Usage:

df = df.groupby("big").apply(calc_time_without_offer, big_max)
df["time_without_offer"]

Outputs:

0       -1175 days
1        -634 days
2       -1233 days
3         108 days
4       -1546 days
           ...
99995   -1603 days
99996    -457 days
99997    1965 days
99998     882 days
99999   -2390 days
Name: time_without_offer, Length: 100000, dtype: timedelta64[ns]
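The negative values are expected with this setup: the dates in each column are drawn independently at random, so the minuend can be earlier than the latest of the preceding columns. A small sketch with made-up dates (the frame toy and its values are illustrative only) shows the same row-wise maximum and subtraction:

```python
import pandas as pd

# Hypothetical dates: fc17 plays the role of the minuend column
toy = pd.DataFrame({
    "fc15": pd.to_datetime(["2019-01-01", "2019-03-01"]),
    "fc16": pd.to_datetime(["2019-02-01", "2019-01-15"]),
    "fc17": pd.to_datetime(["2019-02-11", "2019-02-01"]),
})

# Latest date per row across the preceding columns
latest_previous = toy[["fc15", "fc16"]].max(axis=1)

# Positive when fc17 is later than every preceding date, negative otherwise
gap = toy["fc17"] - latest_previous
print(gap)
```

In the first row fc17 follows the latest earlier date by 10 days; in the second row it precedes it, which gives -28 days.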
