How To Replace Only Single Numbers With Another Number In A Pandas Dataframe?
I have the following pandas dataframe:
date
0 1
1 2
2 23
3 31
4 4
...
n 3
How can I only replace all the numbers from 1 to 9 (i.e. numbers with one digit) with the
Solution 1:
If needed, cast the column to str using astype(str), then call str.zfill to zero-pad those numbers:
In [13]:
df['date'] = df['date'].astype(str).str.zfill(2)
df
Out[13]:
date
0 01
1 02
2 23
3 31
4 04
Regarding your comment, you can then prepend the century as a string:
In [17]:
df['year'] = '20' + df['date']
df
Out[17]:
date year
0 01 2001
1 02 2002
2 23 2023
3 31 2031
4 04 2004
The above works when the column dtype is already str; otherwise, cast it with astype(str) first.
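As a self-contained sketch of the two steps above, assuming the column starts out as integers (the sample values are the ones from the question, and the `year` column name follows this answer):

```python
import pandas as pd

# Sample data from the question; assumed here to be stored as integers.
df = pd.DataFrame({'date': [1, 2, 23, 31, 4]})

# Cast to str first, because the .str accessor only works on strings,
# then left-pad every value with zeros to a width of 2.
df['date'] = df['date'].astype(str).str.zfill(2)

# String concatenation works now that the column holds str values.
df['year'] = '20' + df['date']

print(df)
```

Skipping the astype(str) step on an integer column would make both the .str call and the concatenation fail, which is why the cast comes first.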
Solution 2:
Use word boundaries:
Find: \b(\d)\b
Replace: 0$1
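The Find/Replace pair above uses editor-style syntax. A minimal sketch of the same idea in pandas, assuming the values are already strings, could look like this (Python writes the backreference as \1 rather than $1; the names `s` and `padded` are just for illustration):

```python
import pandas as pd

s = pd.Series(['1', '2', '23', '31', '4'])

# \b(\d)\b matches a digit that stands alone as a whole "word";
# \1 in the replacement refers back to that captured digit.
padded = s.str.replace(r'\b(\d)\b', r'0\1', regex=True)

print(padded.tolist())  # ['01', '02', '23', '31', '04']
```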
Solution 3:
Use a regex, something like
import re
p = re.compile(r'\b\d\b')
p.sub(lambda x: '0' + x.group(), '0 1 2 23 34 5')
## result: '00 01 02 23 34 05'
Solution 4:
Try ^([0-9])$ for the pattern and 0\1 for the replacement:
>>> import pandas as pd
>>> df = pd.DataFrame(data={'date': ['1', '2', '12', '31']})
>>> df['date'].replace('^([0-9])$', r'0\1', regex=True)
0 01
1 02
2 12
3 31
Name: date, dtype: object
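To see how the anchored pattern differs from a word-boundary pattern, here is a small sketch using only the standard re module (the names `anchored` and `bounded` are illustrative). The anchors only match when the whole string is a single digit, which suits a column with one value per cell:

```python
import re

anchored = re.compile(r'^([0-9])$')  # the whole string must be one digit
bounded = re.compile(r'\b(\d)\b')    # any stand-alone digit inside the string

print(anchored.sub(r'0\1', '5'))       # 05
print(anchored.sub(r'0\1', '5 12 7'))  # 5 12 7 (no match, unchanged)
print(bounded.sub(r'0\1', '5 12 7'))   # 05 12 07
```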
Reading the comments that you wrote on other questions, it seems like you are doing date formatting. I believe it's better to use datetime for this. Here's an example:
>>> from datetime import datetime
>>> df = pd.DataFrame(data={'date': ['1', '2', '12', '31'], 'month': ['1', '2', '5', '12'], 'year': ['07', '10', '16', '17']})
>>> dates = df.apply(lambda row: datetime(year=2000+int(row['year']), month=int(row['month']), day=int(row['date'])), axis=1)
>>> dates
0 2007-01-01
1 2010-02-02
2 2016-05-12
3 2017-12-31
dtype: datetime64[ns]
>>> dates.apply(lambda row: row.strftime('%x'))
0 01/01/07
1 02/02/10
2 05/12/16
3 12/31/17
dtype: object
>>> dates.apply(lambda row: row.strftime('%Y-%m-%d'))
0 2007-01-01
1 2010-02-02
2 2016-05-12
3 2017-12-31
dtype: object
This way, you get better control over the date format.
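As an alternative sketch, pd.to_datetime can assemble dates from a frame whose columns are named year, month and day, which avoids the row-wise apply. This assumes, as in the example above, that all two-digit years belong to the 2000s; the `parts` name is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({'date': ['1', '2', '12', '31'],
                   'month': ['1', '2', '5', '12'],
                   'year': ['07', '10', '16', '17']})

# Build a frame with the column names pd.to_datetime expects.
# Assumption: two-digit years are in the 2000s, as in the answer above.
parts = pd.DataFrame({'year': 2000 + df['year'].astype(int),
                      'month': df['month'].astype(int),
                      'day': df['date'].astype(int)})
dates = pd.to_datetime(parts)

# The .dt accessor formats the whole column at once.
print(dates.dt.strftime('%Y-%m-%d').tolist())
```

The result is a datetime64 Series, so the same strftime formats shown above can be applied through .dt.strftime.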
Edit
If you need even more control over the conversion, make a function instead:
>>> def convert_dates(row):
...     year = row['year']
...     month = row['month']
...     day = row['date']
...     if '' in [year, month, day]:
...         return None  # Don't bother with empty values
...     year, month, day = [int(x) for x in [year, month, day]]
...     if year < 100:
...         year += 2000
...     return datetime(year, month, day)
...
>>> df = pd.DataFrame(data={'date': ['11', '2', '1', '31'], 'month': ['08', '2', '5', '12'], 'year': ['1985', '10', '16', '']})
>>> df.apply(convert_dates, axis=1)
0 1985-08-11
1 2010-02-02
2 2016-05-01
3 NaT
dtype: datetime64[ns]