Replace A String From One Column In Another Column

Is it possible to replace strings from one column with corresponding strings from another column in a pandas DataFrame using only the pandas.Series.str methods? 'No' is an accepta

Solution 1:

str.replace is a Series method, so it is applied to each element of one particular column, and it has no way to refer to any other column.

So you have to import re and use re.sub instead, inside a function applied to each row (so that the function can refer to the other columns of the current row).

Your task can be performed in a single instruction:

import re

df['replaced'] = df.apply(lambda row: re.sub(
    '^' + re.escape(row['names']) + r'\s*', '', row['hobbies']), axis=1)
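To make the one-liner easier to try out, here is a minimal self-contained sketch. The sample data and the helper name strip_leading_name are hypothetical (the question's real DataFrame is not shown here); only the names and hobbies columns come from the answer. re.escape is used so that a name containing regex metacharacters, such as a dot, is matched literally.

```python
import re
import pandas as pd

# Hypothetical sample data; only the column names come from the answer.
df = pd.DataFrame({
    "names": ["Anna", "Bob", "C.J."],
    "hobbies": ["Anna likes chess", "Bob  plays guitar", "C.J. enjoys hiking"],
})

def strip_leading_name(row):
    # Match the name only at the start of the string, plus any whitespace after it.
    pattern = "^" + re.escape(row["names"]) + r"\s*"
    return re.sub(pattern, "", row["hobbies"])

# axis=1 passes each row to the function as a Series.
df["replaced"] = df.apply(strip_leading_name, axis=1)
print(df["replaced"].tolist())
```

A named function is used instead of a lambda only for readability; the behaviour is the same.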

This solution runs faster than building a Series inside an explicit for loop and assigning it to a column afterwards, because apply takes care of iterating over the DataFrame, so the applied function only has to produce the value for the current row.

Execution speed also benefits from not having to locate the current row by index on every iteration of the loop.

Note also that your code would fail if the index were anything other than consecutive integers starting from 0. Try, for example, creating your DataFrame with the index=np.arange(1, 5) parameter.
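The index problem can be reproduced with a short sketch. The question's original loop is not shown here, so the lookup below is only an assumed stand-in for code that addresses rows by position-like labels 0..n-1; the sample data is hypothetical as well.

```python
import re
import numpy as np
import pandas as pd

# Hypothetical data with an index that starts at 1 instead of 0.
df = pd.DataFrame(
    {
        "names": ["Anna", "Bob", "Cleo", "Dan"],
        "hobbies": ["Anna chess", "Bob guitar", "Cleo hiking", "Dan chess"],
    },
    index=np.arange(1, 5),
)

# Assumed stand-in for a loop that looks rows up by labels 0..n-1:
# label 0 does not exist in this index, so the lookup raises KeyError.
try:
    df.loc[0, "names"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# apply hands each row to the function directly, so index labels do not matter.
replaced = df.apply(
    lambda row: re.sub("^" + re.escape(row["names"]) + r"\s*", "", row["hobbies"]),
    axis=1,
)
print(lookup_failed, replaced.tolist())
```

The result keeps the original index labels, so it can be assigned back as a new column without realignment problems.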

Solution 2:

apply combined with Python's built-in str.replace will do the job here:

df.apply(lambda x: x['hobbies'].replace(x['names'], ''), axis=1)

It takes every row of the DataFrame and replaces the value of 'names' found in 'hobbies' with an empty string.
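The two solutions are not equivalent in every case, which the following sketch tries to illustrate on a single hypothetical row. str.replace removes every occurrence of the name and leaves the surrounding whitespace alone, while the anchored re.sub pattern from Solution 1 removes only a leading name together with the whitespace after it.

```python
import re
import pandas as pd

# Hypothetical row where the name also appears inside another word.
df = pd.DataFrame({"names": ["Ann"], "hobbies": ["Ann likes Annual fairs"]})

# Solution 2 style: plain substring replacement, every occurrence is removed.
plain = df.apply(lambda row: row["hobbies"].replace(row["names"], ""), axis=1)

# Solution 1 style: only a name at the start of the string is removed.
anchored = df.apply(
    lambda row: re.sub("^" + re.escape(row["names"]) + r"\s*", "", row["hobbies"]),
    axis=1,
)

print(repr(plain.iloc[0]))     # ' likes ual fairs'
print(repr(anchored.iloc[0]))  # 'likes Annual fairs'
```

Which behaviour is wanted depends on the data; if names can only ever appear once, at the start, the simpler str.replace version may be enough.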
