Pandas Inconsistent Date-time Format
I started using the pandas library about a fortnight ago and am still learning its features. I would appreciate help with the following problem. I have a column with dates in mixed formats. Thes
Solution 1:
The real problem is that there are ambiguous dates in your dataset: do you parse a value as mm/dd/yyyy or dd/mm/yyyy if it could be either? (I've been here, and we decided just to pick what the majority seemed to be; essentially the dataset was compromised, and we had to treat it as such.)
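One way to see what "the majority" looks like is to count the values whose order is forced, since a field greater than 12 can't be a month. This is only a rough sketch: it assumes every value has the shape a/b/yyyy with slashes, and the variable names are mine, not anything from pandas.

```python
import pandas as pd

# Sample values copied from the example Series used in this answer.
s = pd.Series(['6/5/2016', '7/5/2016', '7/5/2016', '7/5/2016', '9/5/2016',
               '9/5/2016', '9/5/2016', '9/5/2016', '5/13/2016', '5/14/2016',
               '5/14/2016'])

# Split "a/b/yyyy" into columns and convert the first two fields to integers.
parts = s.str.split('/', expand=True)
first = parts[0].astype(int)
second = parts[1].astype(int)

# A field greater than 12 cannot be a month, so it pins down the order.
must_be_month_first = second > 12   # e.g. 5/13/2016
must_be_day_first = first > 12      # e.g. 13/5/2016
ambiguous = ~(must_be_month_first | must_be_day_first)

summary = {
    'month_first': int(must_be_month_first.sum()),
    'day_first': int(must_be_day_first.sum()),
    'ambiguous': int(ambiguous.sum()),
}
print(summary)
```

On the sample values this reports three rows that can only be month-first, none that can only be day-first, and eight that could be either, so month-first would be the majority pick. The ambiguous rows are still a guess either way.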
If it's a Series, then hitting it with pd.to_datetime seems to work:
In [11]: s = pd.Series(['6/5/2016', '7/5/2016', '7/5/2016', '7/5/2016', '9/5/2016', '9/5/2016', '9/5/2016', '9/5/2016', '5/13/2016', '5/14/2016', '5/14/2016'])
In [12]: pd.to_datetime(s)
Out[12]:
0    2016-06-05
1    2016-07-05
2    2016-07-05
3    2016-07-05
4    2016-09-05
5    2016-09-05
6    2016-09-05
7    2016-09-05
8    2016-05-13
9    2016-05-14
10   2016-05-14
Name: 0, dtype: datetime64[ns]
Note: If you have a consistent format, you can pass it in explicitly:
In [13]: pd.to_datetime(s, format="%m/%d/%Y")
Out[13]:
0    2016-06-05
1    2016-07-05
2    2016-07-05
3    2016-07-05
4    2016-09-05
5    2016-09-05
6    2016-09-05
7    2016-09-05
8    2016-05-13
9    2016-05-14
10   2016-05-14
Name: 0, dtype: datetime64[ns]
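If the column really is mixed, one possible approach is to parse with the majority format first and retry only the failures with the other one. The sketch below assumes month-first is the majority. The three sample values are made up for illustration (the last one is deliberately day-first) and are not from the question's data.

```python
import pandas as pd

# Hypothetical column: mostly month-first, with one day-first straggler.
s = pd.Series(['6/5/2016', '5/13/2016', '14/5/2016'])

# errors="coerce" turns values that do not fit the format into NaT
# instead of raising.
month_first = pd.to_datetime(s, format="%m/%d/%Y", errors="coerce")

# Parse again as day-first, then use it only where the first pass gave NaT.
day_first = pd.to_datetime(s, format="%d/%m/%Y", errors="coerce")
parsed = month_first.fillna(day_first)

print(parsed)

# Anything still NaT matched neither format and needs a manual look.
print(s[parsed.isna()])
```

Note that this doesn't resolve the ambiguity: a value like 6/5/2016 fits both formats and is silently taken as month-first, so the result is only as good as the majority assumption.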