Calculate Datetime-difference In Years, Months, Etc. In A New Pandas Dataframe Column
I have a pandas dataframe looking like this:
Name  start       end
A     2000-01-10  1970-04-29
I want to add a new column providing the difference between the start and end columns in years, months, etc.
Solution 1:
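You can first compute the difference as a timedelta column and then convert it to years. Because a year is not a fixed-length unit, newer pandas versions may reject np.timedelta64(1, 'Y'), so it is safer to divide by days and an average year length:
df['diff'] = df['start'] - df['end']
df['diff_year'] = df['diff'] / np.timedelta64(1, 'D') / 365.25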
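As a minimal, self-contained sketch of this approach: it assumes the start and end columns arrive as strings and are converted with pd.to_datetime, and it assumes an average year of 365.25 days, so the result is an approximation rather than exact calendar arithmetic.

```python
import pandas as pd

# Sample frame from the question; the dates start out as plain strings.
df = pd.DataFrame({"Name": ["A"], "start": ["2000-01-10"], "end": ["1970-04-29"]})
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

# Subtracting two datetime columns yields a timedelta column.
df["diff"] = df["start"] - df["end"]

# Fractional years, ASSUMING an average year length of 365.25 days.
df["diff_year"] = df["diff"].dt.days / 365.25

print(df)
```

The diff_year column holds a float, which is convenient for sorting or filtering but does not split the result into whole years and months.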
Solution 2:
This is pretty straightforward with relativedelta. Note that df.ix has been removed from pandas, so df.loc is used here:
from dateutil import relativedelta
>>          end       start
>> 0 1970-04-29  2000-01-10
for i in df.index:
    df.at[i, 'diff'] = relativedelta.relativedelta(df.loc[i, 'start'], df.loc[i, 'end'])
>>          end       start                                            diff
>> 0 1970-04-29  2000-01-10  relativedelta(years=+29, months=+8, days=+12)
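If separate numeric columns are more useful than a relativedelta object, the components can be pulled out one by one. This is only a sketch: the column names years, months and days are made up for illustration, and it assumes both date columns already have a datetime dtype.

```python
import pandas as pd
from dateutil.relativedelta import relativedelta

df = pd.DataFrame({
    "start": pd.to_datetime(["2000-01-10"]),
    "end": pd.to_datetime(["1970-04-29"]),
})

# One relativedelta per row; iterating a datetime column yields Timestamp
# objects, which relativedelta accepts because they subclass datetime.
deltas = [relativedelta(s, e) for s, e in zip(df["start"], df["end"])]

# Hypothetical column names holding the calendar components.
df["years"] = [d.years for d in deltas]
df["months"] = [d.months for d in deltas]
df["days"] = [d.days for d in deltas]

print(df)
```

For the sample row this gives 29 years, 8 months and 12 days, matching the relativedelta shown above.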
Solution 3:
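A much simpler way is to use the date_range function and take the length of the result. With freq='M' this counts the month-end dates between the two timestamps, which is 12 for the dates below (newer pandas versions spell this alias 'ME'):
startdt = pd.to_datetime('2017-01-01')
enddt = pd.to_datetime('2018-01-01')
len(pd.date_range(start=startdt, end=enddt, freq='M'))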
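The date_range count works on one pair of timestamps at a time. For whole DataFrame columns, a different technique can be swapped in: plain year and month arithmetic on the .dt accessors. The sketch below is not the date_range method itself. It assumes start is never earlier than end, and the column names diff_months, diff_years and rem_months are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "start": pd.to_datetime(["2000-01-10", "2018-01-01"]),
    "end": pd.to_datetime(["1970-04-29", "2017-01-01"]),
})

# Calendar months between the two dates, ignoring the day of month.
months = (df["start"].dt.year - df["end"].dt.year) * 12 + (
    df["start"].dt.month - df["end"].dt.month
)

# Drop one month when the last month is incomplete, i.e. the day of
# month in start has not yet reached the day of month in end.
months = months - (df["start"].dt.day < df["end"].dt.day).astype(int)

df["diff_months"] = months
df["diff_years"] = months // 12   # whole years
df["rem_months"] = months % 12    # months left over after whole years

print(df)
```

The first row comes out as 356 months (29 years and 8 months), the second as exactly 12 months.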
Solution 4:
With a simple function you can reach your goal.
The function approximates the difference in years and months, assuming 365.25 days per year and 30 days per month.
import datetime
import pandas as pd

def parse_date(td):
    # Number of years, including the fractional part after the decimal point.
    resYear = td.days / 365.25
    # Convert the fractional part back to days and divide by 30 to get months.
    resMonth = int((resYear - int(resYear)) * 365.25 / 30)
    resYear = int(resYear)
    return str(resYear) + "Y" + str(resMonth) + "m"

df = pd.DataFrame([("2000-01-10", "1970-04-29")], columns=["start", "end"])
df["delta"] = [parse_date(datetime.datetime.strptime(start, '%Y-%m-%d') - datetime.datetime.strptime(end, '%Y-%m-%d')) for start, end in zip(df["start"], df["end"])]
print(df)
        start         end  delta
0  2000-01-10  1970-04-29  29Y8m
Solution 5:
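I think this is the most 'pandas' way to do it, without using any for loops or defining external functions:
>>> from datetime import datetime
>>> import pandas as pd
>>> df = pd.DataFrame({'Name': ['A'], 'start': [datetime(2000, 1, 10)], 'end': [datetime(1970, 4, 29)]})
>>> dates = map(lambda td: datetime(1, 1, 1) + td.to_pytimedelta(), df['start'] - df['end'])
>>> df['diff'] = list(map(lambda d: '{0}y{1}m'.format(d.year - 1, d.month - 1), dates))
>>> df
  Name      start        end   diff
0    A 2000-01-10 1970-04-29  29y8m
I had to use map instead of apply because of pandas' timedelta64, which doesn't allow a simple addition to a datetime object. In Python 3, map returns an iterator, so the result is wrapped in list() before being assigned. The intermediate dates are kept out of the DataFrame because years this small fall outside the range of pandas' default nanosecond timestamps.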