Tuesday, 12 July 2016

Pandas dataframe: combine duplicate columns into one, separating the data by comma

My dataframe has a few duplicate column names. When a column name occurs more than once, I want to combine those columns into a single column, keeping the data from each duplicate separated by commas. Can anyone suggest a way to do this?

I have constructed an example below. In my actual dataframe, the column names are not known in advance.

Input DataFrame:

-------------------------------------
Index | Col1 | Col2 | Col3 | Col2 |
-------------------------------------
A     | CA1  | CA2  | CA3  | CA5  |
B     | CB1  | CB2  | CB3  | CB5  |
C     | CC1  | CC2  | CC3  | CC5  |
D     | CD1  | CD2  | CD3  | CD5  |
E     | CE1  | CE2  | CE3  | CE5  |
-------------------------------------

Output DataFrame:

---------------------------------
Index | Col1 | Col2    | Col3 |
---------------------------------
A     | CA1  | CA2,CA5 | CA3  |
B     | CB1  | CB2,CB5 | CB3  |
C     | CC1  | CC2,CC5 | CC3  |
D     | CD1  | CD2,CD5 | CD3  |
E     | CE1  | CE2,CE5 | CE3  |
---------------------------------
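
One possible way to get from the input to the output above is sketched here. It is only an illustration, not a definitive answer: the function name `combine_duplicate_columns` and the `sep` parameter are made up for this example, and it assumes that every value can be converted to a string before joining (so a missing value would show up as the text "nan"). It walks over the distinct column names in order of first appearance, selects all columns that share each name, and joins their values row by row.

```python
import pandas as pd


def combine_duplicate_columns(df, sep=","):
    """Merge columns that share a name, joining their values with `sep`.

    Hypothetical helper for illustration. All values are converted to
    strings, including those in columns that have no duplicate.
    """
    combined = {}
    # unique() keeps the order in which each name first appears
    for name in df.columns.unique():
        # A boolean mask always yields a 2-D block, even for a single column
        block = df.loc[:, df.columns == name].to_numpy()
        combined[name] = [sep.join(map(str, row)) for row in block]
    return pd.DataFrame(combined, index=df.index)


# Rebuild the example input, with "Col2" appearing twice
df = pd.DataFrame(
    [
        ["CA1", "CA2", "CA3", "CA5"],
        ["CB1", "CB2", "CB3", "CB5"],
        ["CC1", "CC2", "CC3", "CC5"],
        ["CD1", "CD2", "CD3", "CD5"],
        ["CE1", "CE2", "CE3", "CE5"],
    ],
    index=["A", "B", "C", "D", "E"],
    columns=["Col1", "Col2", "Col3", "Col2"],
)

result = combine_duplicate_columns(df)
print(result)
```

Because the column names are read from `df.columns` at run time, nothing about them has to be known in advance. If the duplicates should be joined with something other than a comma, a different `sep` can be passed.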
