I have a dataframe as shown below. I want to merge the lists if they have at least one value in common. It is okay to take any of the component numbers. For example, [1,2] and [1,4,9] have 1 as a common value, so both will be merged into [1,2,4,9]. Now [1,2] has component number 80 and [1,4,9] has component number 30. For [1,2,4,9] it is okay to have either of them as the component number. In the example given below, I have used 30.
Is it possible to have a solution using dataframe or RDD operations, avoiding as much iteration as possible? Thanks.
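To make the intended merge rule concrete, here is a minimal plain-Python sketch, not a Spark solution. It treats every value as a node and joins values that appear in the same list, so lists that overlap directly or through a chain of other lists end up in one group (a union-find / connected-components approach). The function name `merge_overlapping`, the `(component, values)` row layout, and the third sample row are assumptions made for illustration only. This version keeps the first component number it sees for each group, which is allowed since any of them is acceptable.

```python
from collections import defaultdict


def merge_overlapping(rows):
    """Merge lists sharing at least one value.

    rows: iterable of (component, values) pairs (hypothetical layout).
    Returns a list of (component, sorted merged values) pairs.
    """
    rows = list(rows)
    parent = {}

    def find(x):
        # Register unseen values as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link all values of a list to its first value.
    for _, values in rows:
        if not values:
            continue
        find(values[0])
        for v in values[1:]:
            union(values[0], v)

    merged = defaultdict(set)
    component_of = {}
    for component, values in rows:
        if not values:
            continue
        root = find(values[0])
        merged[root].update(values)
        # Keep the first component number seen for this group.
        component_of.setdefault(root, component)

    return [(component_of[root], sorted(vals)) for root, vals in merged.items()]


# Sample rows: the first two come from the question, the third is made up.
rows = [(80, [1, 2]), (30, [1, 4, 9]), (15, [7, 8])]
result = merge_overlapping(rows)
print(result)  # [(80, [1, 2, 4, 9]), (15, [7, 8])]
```

On a cluster, the same idea corresponds to a connected-components computation over a graph whose edges link values that occur in the same list; the GraphFrames package offers a `connectedComponents()` method that may be a reasonable starting point, though whether it fits this data size and setup is untested here.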