Concatenate Rows In A Dataframe

I have a dataframe structured like below:

Column A  Column B
1         A
1         B
1         C
1         D
2         B
2         C
2         D
2         E

Solution 1:

In R, we can use dplyr: group by 'ColumnA', then use mutate to create a new column that pastes together the contents of 'ColumnB' within each group.

library(dplyr)
df1 %>%
     group_by(ColumnA) %>% 
     mutate(ColumnC = paste(ColumnB, collapse=""))
# A tibble: 8 x 3
# Groups:   ColumnA [2]
#  ColumnA ColumnB ColumnC
#    <int>   <chr>   <chr>
#1       1       A    ABCD
#2       1       B    ABCD
#3       1       C    ABCD
#4       1       D    ABCD
#5       2       B    BCDE
#6       2       C    BCDE
#7       2       D    BCDE
#8       2       E    BCDE

Another option is data.table, which adds the new column by reference:

library(data.table)
setDT(df1)[,  ColumnC := paste(ColumnB, collapse=""), by = ColumnA]

Data:

df1 <- structure(list(ColumnA = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), ColumnB = c("A", 
 "B", "C", "D", "B", "C", "D", "E")), .Names = c("ColumnA", "ColumnB"
 ), class = "data.frame", row.names = c(NA, -8L))

In Python, the same can be done with pandas, using groupby and transform:

>>> import pandas as pd
>>> df1 = pd.read_clipboard()
>>> df1
#   ColumnA ColumnB
#1        1       A
#2        1       B
#3        1       C
#4        1       D
#5        2       B
#6        2       C
#7        2       D
#8        2       E
>>> df1['ColumnC'] = df1.groupby('ColumnA')['ColumnB'].transform(lambda x: ''.join(x))
>>> df1
#   ColumnA ColumnB ColumnC
#1        1       A    ABCD
#2        1       B    ABCD
#3        1       C    ABCD
#4        1       D    ABCD
#5        2       B    BCDE
#6        2       C    BCDE
#7        2       D    BCDE
#8        2       E    BCDE
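
The session above reads its input from the clipboard, so it cannot be rerun as is. The following is a minimal, reproducible sketch of the same idea, assuming the example data is built by hand. The name `per_group` is only illustrative and does not appear in the original answer.

```python
import pandas as pd

# Same example data as in the answer, built explicitly so the snippet
# does not depend on clipboard contents.
df1 = pd.DataFrame({
    "ColumnA": [1, 1, 1, 1, 2, 2, 2, 2],
    "ColumnB": ["A", "B", "C", "D", "B", "C", "D", "E"],
})

# transform returns one value per original row, aligned to df1's index,
# so the result can be assigned directly as a new column.
df1["ColumnC"] = df1.groupby("ColumnA")["ColumnB"].transform(lambda s: "".join(s))

# agg returns one value per group instead (illustrative name `per_group`).
per_group = df1.groupby("ColumnA")["ColumnB"].agg(lambda s: "".join(s))

print(df1)
print(per_group)
```

transform is the choice when every row should keep its place and carry the group's string. agg is the choice when one row per group is enough.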

Solution 2:

A one-liner in base R, as suggested by @Sotos in the comments. For this solution, make sure that ColumnB of df is a character vector and not a factor.

with(df, ave(ColumnB, ColumnA, FUN = function(i) paste(i, collapse = '')))

Another base R solution:

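# Note: each = 4 assumes every group has exactly four rows and that df is sorted by ColumnA.
df$ColumnC <- rep(unlist(by(df, INDICES = df$ColumnA,
    function(t) paste(t$ColumnB, collapse = ""), simplify = FALSE)), each = 4)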

>df
#ColumnA ColumnB ColumnC
#1       1       a    abcd
#2       1       b    abcd
#3       1       c    abcd
#4       1       d    abcd
#5       2       b    bcde
#6       2       c    bcde
#7       2       d    bcde
#8       2       e    bcde
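
The by() and rep() approach above computes one string per group and then repeats it. With each=4 it relies on every group having four rows and on the rows being sorted by ColumnA. As a rough pandas sketch of the same two-step idea without that assumption, the per-group result can be looked up by key for each row. The data below is hypothetical, chosen to have unequal, unsorted groups.

```python
import pandas as pd

# Hypothetical data: groups of unequal size, rows not sorted by ColumnA.
df = pd.DataFrame({
    "ColumnA": [2, 1, 2, 1, 2],
    "ColumnB": ["b", "a", "c", "d", "e"],
})

# Step 1: one concatenated string per group (the role of by() above).
per_group = df.groupby("ColumnA")["ColumnB"].agg(lambda s: "".join(s))

# Step 2: look up each row's key in that result, rather than repeating
# each string a fixed number of times.
df["ColumnC"] = df["ColumnA"].map(per_group)

print(df)
```

Because the lookup goes by key, the order of the rows and the size of each group do not matter.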
