Pandas Inconsistent Date-time Format

I started using the pandas library about a fortnight ago and am still learning its features. I would appreciate help with the following problem. I have a column with dates in mixed formats. Thes

Solution 1:

The real problem is that there are ambiguous dates in your dataset: do you parse a value as mm/dd/yyyy or dd/mm/yyyy if it could be either? (I've been here, and we decided just to pick whichever format the majority seemed to use; essentially the dataset was compromised, and we had to treat it as such.)
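
To get a feel for how bad the ambiguity is before picking a convention, you can count how many values could be read either way. This is only a rough sketch in plain Python: the `classify` helper is hypothetical (not part of pandas), it assumes every value looks like `a/b/yyyy`, and it only range-checks the fields rather than validating real calendar dates.

```python
from collections import Counter


def classify(date_str):
    """Label an 'a/b/yyyy' string by which field order(s) it could be.

    Hypothetical helper: it only range-checks the two leading fields,
    it does not validate month lengths or leap years.
    """
    first, second, _year = (int(part) for part in date_str.split("/"))
    month_first_ok = 1 <= first <= 12 and 1 <= second <= 31
    day_first_ok = 1 <= second <= 12 and 1 <= first <= 31
    if month_first_ok and day_first_ok:
        # e.g. 5/5/2016 is the same date under both conventions
        return "same_either_way" if first == second else "ambiguous"
    if month_first_ok:
        return "month_first"
    if day_first_ok:
        return "day_first"
    return "invalid"


dates = ['6/5/2016', '7/5/2016', '7/5/2016', '7/5/2016', '9/5/2016',
         '9/5/2016', '9/5/2016', '9/5/2016', '5/13/2016', '5/14/2016',
         '5/14/2016']

# Tally how many strings fall into each bucket
counts = Counter(classify(d) for d in dates)
print(counts)
```

The unambiguous values are the only real evidence of which convention the data uses, so a "go with the majority" decision has to rest on those buckets; the ambiguous values themselves cannot tell you anything.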


If it's a Series then hitting it with pd.to_datetime seems to work:

In [11]: s = pd.Series(['6/5/2016', '7/5/2016', '7/5/2016', '7/5/2016', '9/5/2016', '9/5/2016', '9/5/2016', '9/5/2016', '5/13/2016', '5/14/2016', '5/14/2016'])

In [12]: pd.to_datetime(s)
Out[12]:
0    2016-06-05
1    2016-07-05
2    2016-07-05
3    2016-07-05
4    2016-09-05
5    2016-09-05
6    2016-09-05
7    2016-09-05
8    2016-05-13
9    2016-05-14
10   2016-05-14
Name: 0, dtype: datetime64[ns]

Note: If you have a consistent format, you can pass it in explicitly:

In [13]: pd.to_datetime(s, format="%m/%d/%Y")
Out[13]:
0    2016-06-05
1    2016-07-05
2    2016-07-05
3    2016-07-05
4    2016-09-05
5    2016-09-05
6    2016-09-05
7    2016-09-05
8    2016-05-13
9    2016-05-14
10   2016-05-14
Name: 0, dtype: datetime64[ns]
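
If the format is only mostly consistent, one option is to combine the explicit `format` with `errors="coerce"`, so values that don't fit become `NaT` and can be inspected or re-parsed separately. A minimal sketch, assuming month-first is the convention you settled on; the value `'13/5/2016'` is made up here to stand in for a stray day-first entry and is not from the question's data.

```python
import pandas as pd

# '13/5/2016' is a hypothetical day-first value added for illustration
s = pd.Series(["6/5/2016", "5/13/2016", "13/5/2016"])

# Parse with the chosen format; anything that doesn't fit becomes NaT
parsed = pd.to_datetime(s, format="%m/%d/%Y", errors="coerce")

# The original strings that failed to parse as month-first
bad = s[parsed.isna()]

# Retry only the failures as day-first, then fill the gaps
fallback = pd.to_datetime(bad, format="%d/%m/%Y", errors="coerce")
combined = parsed.fillna(fallback)
print(combined)
```

This only rescues values that are impossible under the first format. Something like `'6/5/2016'` still gets read as June 5 without complaint, so the ambiguity itself doesn't go away.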
