
How To Replace Only Single Numbers With Another Number In A Pandas Dataframe?

I have the following pandas dataframe:

   date
0     1
1     2
2    23
3    31
4     4
...
n     3

How can I only replace all the numbers from 1 to 9 (i.e. numbers with one digit) with the

Solution 1:

If the column is not already a string, cast it with astype(str), then call str.zfill(2) to left-pad single-digit values with a zero:

In [13]:
df['date'] = df['date'].astype(str).str.zfill(2)
df

Out[13]:
  date
0   01
1   02
2   23
3   31
4   04

Regarding your comment, you can prepend '20' to the padded strings to build a four-digit year:

In [17]:
df['year'] = '20' + df['date']
df

Out[17]:
  date  year
0   01  2001
1   02  2002
2   23  2023
3   31  2031
4   04  2004

The above works when the column dtype is already str.
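
The dtype matters because concatenating the string '20' with an integer column would typically fail with a TypeError. A minimal sketch, assuming a hypothetical frame whose 'date' column holds integers, converts and pads in one step before concatenating:

```python
import pandas as pd

# Hypothetical frame where 'date' holds integers rather than strings.
df = pd.DataFrame({'date': [1, 2, 23, 31, 4]})

# Convert to str and zero-pad first, so that '20' + ... is str + str.
padded = df['date'].astype(str).str.zfill(2)
df['year'] = '20' + padded

print(df['year'].tolist())  # ['2001', '2002', '2023', '2031', '2004']
```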


Solution 2:

Use word boundaries:

Find: \b(\d)\b
Replace: 0$1
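
The find/replace pair above uses editor-style syntax. In Python the backreference is written \1 rather than $1. A rough sketch of the same idea applied to a string column with Series.str.replace, using sample data made up for illustration:

```python
import pandas as pd

# Sample string column, chosen for illustration.
df = pd.DataFrame({'date': ['1', '2', '23', '31', '4']})

# \b(\d)\b matches a digit that stands alone as a whole word;
# the replacement puts a '0' in front of the captured digit.
df['date'] = df['date'].str.replace(r'\b(\d)\b', r'0\1', regex=True)

print(df['date'].tolist())  # ['01', '02', '23', '31', '04']
```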


Solution 3:

Use a regex, something like:

import re

p = re.compile(r'\b\d\b')  # a single digit standing alone as a word
p.sub(lambda x: '0' + x.group(), '0 1 2 23 34 5')
## result: '00 01 02 23 34 05'

Solution 4:

Try ^([0-9])$ for the pattern and 0\1 for the replacement:

>>> import pandas as p  # this answer aliases pandas as p
>>> df = p.DataFrame(data={'date': ['1', '2', '12', '31']})
>>> df['date'].replace('^([0-9])$', r'0\1', regex=True)

0    01
1    02
2    12
3    31
Name: date, dtype: object
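
The anchored pattern only touches cells that consist of exactly one digit, while a word-boundary pattern also pads lone digits inside longer strings. A small sketch of the difference, using a made-up series (the '1/2/2017' value is only an illustration):

```python
import pandas as pd

# Illustrative values: a lone digit, a two-digit number, a date-like string.
s = pd.Series(['1', '12', '1/2/2017'])

# ^...$ requires the whole cell to be a single digit.
anchored = s.replace(r'^([0-9])$', r'0\1', regex=True)

# \b...\b pads every standalone digit, wherever it appears in the cell.
bounded = s.replace(r'\b(\d)\b', r'0\1', regex=True)

print(anchored.tolist())  # ['01', '12', '1/2/2017']
print(bounded.tolist())   # ['01', '12', '01/02/2017']
```

Which behaviour is wanted depends on whether the column holds single values or composite strings.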

Reading the comments that you wrote on other questions, it seems like you are doing date formatting. I believe it's better to use datetime for this. Here's an example:

>>> from datetime import datetime
>>> df = p.DataFrame(data={'date': ['1', '2', '12', '31'], 'month': ['1', '2', '5', '12'], 'year': ['07', '10', '16', '17']})
>>> dates = df.apply(lambda row: datetime(year=2000+int(row['year']), month=int(row['month']), day=int(row['date'])), axis=1)
>>> dates

0   2007-01-01
1   2010-02-02
2   2016-05-12
3   2017-12-31
dtype: datetime64[ns]
>>> dates.apply(lambda row: row.strftime('%x'))

0    01/01/07
1    02/02/10
2    05/12/16
3    12/31/17
dtype: object
>>> dates.apply(lambda row: row.strftime('%Y-%m-%d'))

0    2007-01-01
1    2010-02-02
2    2016-05-12
3    2017-12-31
dtype: object

This way, you get better control over the date format.
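
As an alternative to the row-wise apply, pandas.to_datetime can assemble dates from a frame whose columns are named year, month and day. The following is a sketch under the same assumptions as the example above (two-digit years belong to the 2000s, and the day is stored in the 'date' column); the intermediate frame name parts is made up here:

```python
import pandas as pd

df = pd.DataFrame(data={'date': ['1', '2', '12', '31'],
                        'month': ['1', '2', '5', '12'],
                        'year': ['07', '10', '16', '17']})

# Build a frame with the column names to_datetime expects.
# Assumption: every two-digit year means 20xx.
parts = pd.DataFrame({
    'year': 2000 + df['year'].astype(int),
    'month': df['month'].astype(int),
    'day': df['date'].astype(int),
})

dates = pd.to_datetime(parts)
print(dates.dt.strftime('%Y-%m-%d').tolist())
# ['2007-01-01', '2010-02-02', '2016-05-12', '2017-12-31']
```

This avoids a Python-level function call per row, which may matter on large frames.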

Edit

If you need even more control over the conversion, make a function instead:

>>> def convert_dates(row):
...     year = row['year']
...     month = row['month']
...     day = row['date']
...     if '' in [year, month, day]:
...         return None # Don't bother with empty values 
...     year, month, day = [int(x) for x in [year, month, day]]
...     if year < 100:
...         year += 2000
...     return datetime(year, month, day)
... 
>>> df = p.DataFrame(data={'date': ['11', '2', '1', '31'], 'month': ['08', '2', '5', '12'], 'year': ['1985', '10', '16', '']})
>>> df.apply(convert_dates, axis=1)

0   1985-08-11
1   2010-02-02
2   2016-05-01
3          NaT
dtype: datetime64[ns]
