Nov-28-2020, 05:36 PM
(This post was last modified: Nov-28-2020, 09:01 PM by Larz60+. Edit Reason: added proper code tags)
My apologies for the similar question asked previously. This question is in Python, but I can't find a correct solution. I have the following dataframe df1:
Output:
SomeJson
[{ "Number": "1234", "Color": "blue", "size": "Medium" }, { "Number": "2222", "Color": "red", "size": "Small" }]
and I am trying to write just the contents of this dataframe as JSON:
df0.coalesce(300).write.mode('append').json(<json_Path>)
It brings in the first key as well, like:
{
"SomeJson": [{
"Number": "1234",
"Color": "blue",
"size": "Medium"
}, {
"Number": "2222",
"Color": "red",
"size": "Small"
}
]
}
But I would not like to have the { "SomeJson": } wrapper in the output file. I have tried the approach below, but I am getting lost writing the custom Python function to eliminate the first header. Any assistance is highly appreciated.
df0.rdd.map(<custom_function>).saveAsTextFile(<json_Path>)
I had tried
df0.rdd.map(lambda x: json.dumps(x["SomeJson"])).saveAsTextFile("filepath")
but this gives only the values in square brackets, not the keys. Also, I would like to remove SomeJson from the output. Any help is much appreciated.
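For illustration, here is a minimal sketch of what such a custom function might look like. It assumes the SomeJson column is an array of structs, so each element arrives as a pyspark.sql.Row. A Row is a tuple subclass, which would explain why json.dumps writes only the values: tuples are serialized as plain JSON arrays. Converting each element with Row.asDict() first should keep the keys. The name to_json_line is hypothetical, and the Spark call is left as a comment because it depends on the actual dataframe.

```python
import json


def to_json_line(row, column="SomeJson"):
    """Serialize only the contents of `column` as a JSON string, keeping the keys."""
    items = row[column]
    # Assumption: each element is a pyspark.sql.Row (a tuple subclass).
    # json.dumps would emit a tuple as a bare array of values, so convert
    # anything that offers asDict() into a dict before dumping.
    dicts = [
        item.asDict(recursive=True) if hasattr(item, "asDict") else item
        for item in items
    ]
    return json.dumps(dicts)


# Hypothetical usage with the dataframe from the post (not run here):
# df0.rdd.map(to_json_line).saveAsTextFile("filepath")
```

Each output line would then hold just the array of objects, without the { "SomeJson": } wrapper, since only the column's contents are passed to json.dumps.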
