Sep-18-2021, 12:22 PM
(This post was last modified: Sep-18-2021, 01:25 PM by gunner1905.)
I have a dataframe which I read from an excel file. The thing is first 4 columns and its values look good. But after 5th column data seems kind of corrupted.
That is, the "dateID" values like "2021-09-06" became columns, "sourceOfData" column became ""values".
And it looks like that:
But i want my data to look like that:
The thing only came to my mind is pivot or melt. I started doing something like this:
Hope I explained my needs. Thanks in advance.
For those who want to reproduce my question and obtain dataframes, here is the code that consists of what i have and what i need.
That is, the "dateID" values like "2021-09-06" became columns, "sourceOfData" column became ""values".
And it looks like that:
But i want my data to look like that:
The thing only came to my mind is pivot or melt. I started doing something like this:
df2 = df.melt(var_name='dateID', value_name='productPrice')
df3 = df2.iloc[1:]in order to organize dates and prices, but I'm stuck.
Hope I explained my needs. Thanks in advance.
For those who want to reproduce my question and obtain dataframes, here is the code that consists of what i have and what i need.
import pandas as pd
whatIHave = {'countryName': ['','United States','Canada'],
'provinceName': ['','New York','Ontario'],
'productID': ['','35','55'],
'productName': ['', 'Sugar', 'Corn'],
'dateID': ['sourceOfData', 'CommissionAgent1', 'CommissionAgent1'],
'2021-09-06': ['productPrice','2.6$','2.6$'],
'2021-09-07': ['productPrice','5.5$','5.5$'],
'2021-09-08': ['productPrice','3.4$','3.4$']
}
df_whatIHave = pd.DataFrame(whatIHave, columns = ['countryName', 'provinceName', 'productID', 'productName', 'dateID', '2021-09-06', '2021-09-07', '2021-09-08'])
print(df_whatIHave)whatINeed = {'countryName': ['United States','United States','United States', 'Canada', 'Canada', 'Canada'],
'provinceName': ['New York','New York','New York', 'Ontario', 'Ontario', 'Ontario'],
'productID': ['35','35','35', '55', '55', '55'],
'productName': ['Sugar', 'Sugar', 'Sugar', 'Corn', 'Corn', 'Corn'],
'sourceOfData': ['CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1'],
'dateID': ['2021-09-06', '2021-09-07', '2021-09-08', '2021-09-06', '2021-09-07', '2021-09-08'],
'productPrice': ['2.6$','5.5$','3.4$','2.6$','5.5$','3.4$']
}
df_whatINeed = pd.DataFrame(whatINeed, columns = ['countryName', 'provinceName', 'productID', 'productName', 'sourceOfData', 'dateID', 'productPrice'])
print(df_whatINeed)
