Count How Many Rows Have Date Within Date Range Of Each Row For Each Id Pandas
I have a dataset where every row has a date range and an ID value. I want to know for each row, how many other rows (that have the same ID) have a date1 within the date range of th
Solution 1:
.merge the DataFrame with itself on ID. Then check whether the date1 you brought over falls between the two dates, excluding rows that merged with themselves.
import pandas as pd
m = test1.reset_index().merge(test1[['ID', 'date1']].reset_index(), on='ID')
#    index_x   ID    date1_x      date2  index_y    date1_y
# 0        0  acb 2018-10-10 2019-01-24        0 2018-10-10
# 1        0  acb 2018-10-10 2019-01-24       22 2018-10-09
# 2        0  acb 2018-10-10 2019-01-24       47 2018-10-19
# 3       22  acb 2018-10-09 2019-03-01        0 2018-10-10
# 4       22  acb 2018-10-09 2019-03-01       22 2018-10-09
m['to_count'] = m.date1_y.ge(m.date1_x) & m.date1_y.le(m.date2) & (m.index_x != m.index_y)
m.groupby('index_x').to_count.sum()
# index_x
# 0     1.0
# 1     0.0
# 2     2.0
# 3     0.0
# ...
# 97    1.0
# 98    3.0
# 99    1.0
Since this is based on the original index, you could assign it back with test1['other_date1_between'] = m.groupby('index_x').to_count.sum().
print(test1.sort_values('ID').head(5))
     ID      date1      date2  other_date1_between
64  aaa 2018-07-21 2019-02-22                  0.0
86  aaa 2018-02-05 2019-05-10                  1.0
6   aab 2018-01-07 2019-04-09                  1.0
42  aab 2018-10-03 2019-03-17                  0.0
9   aac 2018-03-04 2019-02-24                  0.0
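For reference, here is a minimal end-to-end sketch of the same self-merge approach that can be run on its own. The five-row frame is made-up sample data (not the original test1), and the helper name count_other_date1_in_range is hypothetical, introduced only for illustration. It assumes the frame has a unique index and that both ends of the date range are inclusive, as in the ge/le comparison above.

```python
import pandas as pd

# Hypothetical sample data standing in for the original test1.
test1 = pd.DataFrame({
    'ID': ['a', 'a', 'a', 'b', 'b'],
    'date1': pd.to_datetime(['2018-01-01', '2018-02-01', '2018-06-01',
                             '2018-03-01', '2018-03-15']),
    'date2': pd.to_datetime(['2018-03-01', '2018-02-15', '2018-07-01',
                             '2018-04-01', '2018-03-20']),
})


def count_other_date1_in_range(df):
    """Hypothetical helper: for each row, count other rows with the same ID
    whose date1 lies within [date1, date2] of that row."""
    # Pair every row with every row sharing its ID; reset_index keeps the
    # original row labels as index_x (left) and index_y (right).
    m = df.reset_index().merge(df[['ID', 'date1']].reset_index(), on='ID')
    # Keep pairs where the other row's date1 is inside this row's range,
    # and drop pairs where a row was matched with itself.
    m['to_count'] = (
        m['date1_y'].ge(m['date1_x'])
        & m['date1_y'].le(m['date2'])
        & (m['index_x'] != m['index_y'])
    )
    # Summing booleans per original row label gives the count.
    return m.groupby('index_x')['to_count'].sum().rename_axis(None)


# Assignment aligns on the original index, so row order does not matter.
test1['other_date1_between'] = count_other_date1_in_range(test1)
print(test1)
```

Note that the self-merge creates one row per pair of rows sharing an ID, so memory use grows quadratically with the size of the largest ID group.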