Datetime Issues While Time Series Predicting In Pandas
I'm trying to implement a time series prediction model in Python but am running into issues with datetime data. I have a dataframe 'df' with two columns, one of datetime type and one of float type.
Solution 1:
It's complicated.
First of all, when creating a numpy array, all types will be the same. However, datetime64 is not the same as int. So we'll have to resolve that, and we will.
Second, you tried to do this with df.values. That makes sense, but what happens is that pandas casts the whole df to dtype=object and hands you an object array. The problem with that is that the Timestamps are left as Timestamps, which is what is getting in your way.
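So I'd convert the columns myself, like this (using 'int64' explicitly so the result does not depend on the platform's default int size):
import numpy as np
a = np.column_stack([df[c].values.astype('int64') for c in ['transaction_date', 'amount']])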
a
array([[1454284800000000000, 1],
       [1454371200000000000, 2],
       [1454457600000000000, 3],
       [1454544000000000000, 4],
       [1454630400000000000, 5]])
We can always convert the first column of a back to datetimes like this:
a[:, 0].astype(df.transaction_date.values.dtype)
array(['2016-02-01T00:00:00.000000000', '2016-02-02T00:00:00.000000000',
       '2016-02-03T00:00:00.000000000', '2016-02-04T00:00:00.000000000',
       '2016-02-05T00:00:00.000000000'], dtype='datetime64[ns]')
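For anyone who wants to reproduce this end to end, here is a minimal sketch. The sample dataframe is an assumption (the question does not show its data); the values are chosen only to mirror the output above. Note that casting the float column to an integer truncates any fractional part, so this is only reasonable if the amounts are whole numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen to mirror the output shown above.
dates = pd.date_range("2016-02-01", periods=5, freq="D").astype("datetime64[ns]")
df = pd.DataFrame({"transaction_date": dates,
                   "amount": [1.0, 2.0, 3.0, 4.0, 5.0]})

# Mixed datetime/float columns collapse into one object array,
# with the dates left as pandas Timestamp objects.
mixed = df.values
print(mixed.dtype, type(mixed[0, 0]))

# Convert each column to int64 separately, then stack them side by side.
# The datetime column becomes nanoseconds since the Unix epoch.
cols = ["transaction_date", "amount"]
a = np.column_stack([df[c].to_numpy().astype("int64") for c in cols])

# Round-trip the first column back to datetime64[ns].
restored = a[:, 0].astype("datetime64[ns]")
print(restored)
```

If the amounts are not whole numbers, one alternative is to cast both columns to float64 instead, at the cost of some precision in the nanosecond timestamps.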
Solution 2:
You can convert your integer into a timedelta and do the calculations as you did before:
from datetime import timedelta
interval = timedelta(days = 5)
#5 days later
time_stamp += interval
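A self-contained version of the same idea, as a rough sketch. The names n_days and time_stamp and their values are assumptions for illustration, since the question's own variables are not shown. A pandas Timestamp accepts a standard-library timedelta directly.

```python
from datetime import timedelta

import pandas as pd

n_days = 5                                # hypothetical integer offset
time_stamp = pd.Timestamp("2016-02-05")   # hypothetical starting date

# Turn the integer into a timedelta, then shift the timestamp by it.
interval = timedelta(days=n_days)
time_stamp += interval
print(time_stamp)  # 2016-02-10 00:00:00
```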