
Count How Many Rows Have Date Within Date Range Of Each Row For Each Id Pandas

I have a dataset where every row has a date range and an ID value. I want to know for each row, how many other rows (that have the same ID) have a date1 within the date range of th

Solution 1:

Merge the DataFrame with itself on ID using .merge. Then check whether the date1 brought over from the other row falls between the current row's date1 and date2 (inclusive), excluding rows that merged with themselves.

import pandas as pd

m = test1.reset_index().merge(test1[['ID', 'date1']].reset_index(), on='ID')
#   index_x   ID    date1_x      date2  index_y    date1_y
#0        0  acb 2018-10-10 2019-01-24        0 2018-10-10
#1        0  acb 2018-10-10 2019-01-24       22 2018-10-09
#2        0  acb 2018-10-10 2019-01-24       47 2018-10-19
#3       22  acb 2018-10-09 2019-03-01        0 2018-10-10
#4       22  acb 2018-10-09 2019-03-01       22 2018-10-09

m['to_count'] = m.date1_y.ge(m.date1_x) & m.date1_y.le(m.date2) & (m.index_x != m.index_y)
m.groupby('index_x').to_count.sum()
#index_x
#0     1.0
#1     0.0
#2     2.0
#3     0.0
#     ...
#97    1.0
#98    3.0
#99    1.0

Since this is based on the original index, you could assign it back with test1['other_date1_between'] = m.groupby('index_x').to_count.sum().

print(test1.sort_values('ID').head(5))
#     ID      date1      date2  other_date1_between
#64  aaa 2018-07-21 2019-02-22                  0.0
#86  aaa 2018-02-05 2019-05-10                  1.0
#6   aab 2018-01-07 2019-04-09                  1.0
#42  aab 2018-10-03 2019-03-17                  0.0
#9   aac 2018-03-04 2019-02-24                  0.0
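The post does not show how test1 was built, so here is a minimal end-to-end sketch of the same self-merge approach on a small made-up frame. The sample rows are hypothetical and chosen only so the counts are easy to check by hand; the column names ID, date1, date2 and other_date1_between follow the answer above.

```python
import pandas as pd

# Hypothetical sample data (not from the original post).
test1 = pd.DataFrame({
    'ID': ['a', 'a', 'a', 'b', 'b'],
    'date1': pd.to_datetime(['2018-01-01', '2018-02-01', '2018-06-01',
                             '2018-01-01', '2019-01-01']),
    'date2': pd.to_datetime(['2018-03-01', '2018-12-31', '2018-07-01',
                             '2018-06-01', '2019-02-01']),
})

# Pair every row with every row that shares its ID (including itself).
# reset_index() keeps the original row labels as index_x / index_y.
m = test1.reset_index().merge(test1[['ID', 'date1']].reset_index(), on='ID')

# Flag pairs where the other row's date1 lies in [date1_x, date2],
# ignoring the pairs where a row was matched with itself.
m['to_count'] = (m.date1_y.ge(m.date1_x)
                 & m.date1_y.le(m.date2)
                 & (m.index_x != m.index_y))

# Sum the flags per original row; assignment aligns on the original index.
test1['other_date1_between'] = m.groupby('index_x').to_count.sum()
print(test1)
```

With this sample, rows 0 and 1 each get a count of 1 and the remaining rows get 0. Note that the self-merge produces one row per pair within each ID, so the intermediate frame grows quadratically with group size; for very large groups that may be worth keeping in mind.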
