Feb-07-2020, 02:10 PM
Hello guys!
I have a dataframe.
import pandas as pd
data = {'num': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'name': ['281.3891.3891.281',
                 '3891.281.281.3891',
                 '1162.5645.5645.500835.500835.1162',
                 '5645.500835.500835.1162.1162.5645',
                 '500835.1162.1162.5645.5645.500835',
                 '1349.1162.1162.5645.5645.500835.500835.1349',
                 '1162.5645.5645.500835.500835.1349.1349.1162',
                 '5645.500835.500835.1349.1349.1162.1162.5645',
                 '500835.1349.1349.1162.1162.5645.5645.500835',
                 '5645.1162.1162.500835.500835.5645',
                 '1162.500835.500835.5645.5645.1162',
                 '500835.5645.5645.1162.1162.500835'
                 ]}
df = pd.DataFrame(data)
print(df)
Each row in the dataframe is a chain (start point = end point).
But the chains are not unique: a chain shifted cyclically is still the same chain, so A-B-C = B-C-A = C-A-B, but A-B-C ≠ B-A-C.
I can't work out which Python methods to use to put the units of every chain into one common order (by shifting them cyclically) so that I can drop the duplicates.
Could anybody please help?
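To make the goal concrete, here is a minimal sketch of one possible normalization. It assumes each unit of a chain is a pair of dot-separated tokens (start.end), which is how the sample data looks; the function name `canonical_chain`, the `canon` column, and the shortened sample are only illustrative. The idea is to generate every cyclic shift of the units, keep the smallest one as the canonical form, and drop duplicates on that.

```python
import pandas as pd


def canonical_chain(name: str, sep: str = ".") -> str:
    """Return the smallest cyclic shift of a chain, shifting by whole units."""
    tokens = name.split(sep)
    # Assumption: tokens come in (start, end) pairs, one pair per unit.
    units = [tuple(tokens[i:i + 2]) for i in range(0, len(tokens), 2)]
    # All cyclic shifts of the list of units.
    shifts = [units[i:] + units[:i] for i in range(len(units))]
    # Any consistent ordering works; here tokens are compared as strings.
    best = min(shifts)
    return sep.join(tok for unit in best for tok in unit)


# A shortened sample of the data from the post (rows 1, 2, 3, 4 and 10).
sample = pd.DataFrame({
    "num": [1, 2, 3, 4, 10],
    "name": [
        "281.3891.3891.281",
        "3891.281.281.3891",
        "1162.5645.5645.500835.500835.1162",
        "5645.500835.500835.1162.1162.5645",
        "5645.1162.1162.500835.500835.5645",
    ],
})

sample["canon"] = sample["name"].map(canonical_chain)
unique_chains = sample.drop_duplicates(subset="canon")
print(unique_chains[["num", "name"]])
```

In this sketch rows 1 and 2 collapse into one chain, as do rows 3 and 4, while row 10 stays separate because it runs through the same points in the opposite direction.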
