I have the following JSON structure:
{
"comments_v2": [
{
"timestamp": 1196272984,
"data": [
{
"comment": {
"timestamp": 1196272984,
"comment": "OSI Beach Party Weekend, CA",
"author": "xxxx"
}
}
],
"title": "xxxx commented on his own photo."
},
{
"timestamp": 1232918783,
"data": [
{
"comment": {
"timestamp": 1232918783,
"comment": "We'll see about that.",
"author": "xxxx"
}
}
]
}
]
}
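I am trying to flatten this JSON into a pandas DataFrame. Here is my solution:
import codecs
import pandas as pd

# Read the file ("utf-8-sig" strips a leading BOM if present)
df = pd.read_json(codecs.open(infile, "r", "utf-8-sig"))
# Normalize the top-level records under "comments_v2"
df = pd.json_normalize(df["comments_v2"])
# Normalize the nested "data" lists: one column per list position (here only column 0)
child_column = pd.json_normalize(df["data"])
# Expand the dicts in column 0 into separate columns
child_column = pd.concat([child_column.drop([0], axis=1), child_column[0].apply(pd.Series)], axis=1)
# Join the expanded columns back and drop the original nested column
df_merge = df.join(child_column)
df_merge.drop(["data"], axis=1, inplace=True)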
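The resulting DataFrame looks like this:
timestamp | title | comment.timestamp | comment.comment | comment.author | comment.group |
---|---|---|---|---|---|
1196272984 | xxxx commented on his own photo. | 1196272984 | OSI Beach Party Weekend, CA | XXXXXX | NaN |
Is there a simpler way to flatten the JSON and get the result shown above?
Thanks!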
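
One possibly simpler route is to let `pd.json_normalize` do the nesting work in a single call, using its `record_path` and `meta` parameters. The sketch below is only an illustration under a few assumptions: the JSON has already been loaded into a Python dict (here a hypothetical `payload` variable holding the sample from the question, rather than the real file), each `data` list contains dicts shaped like the ones shown, and `errors="ignore"` is used so that entries without a `title` get NaN instead of raising.

```python
import pandas as pd

# Hypothetical in-memory copy of the sample JSON from the question.
# With a real file, this dict could come from json.load() on the opened file.
payload = {
    "comments_v2": [
        {
            "timestamp": 1196272984,
            "data": [
                {
                    "comment": {
                        "timestamp": 1196272984,
                        "comment": "OSI Beach Party Weekend, CA",
                        "author": "xxxx",
                    }
                }
            ],
            "title": "xxxx commented on his own photo.",
        },
        {
            "timestamp": 1232918783,
            "data": [
                {
                    "comment": {
                        "timestamp": 1232918783,
                        "comment": "We'll see about that.",
                        "author": "xxxx",
                    }
                }
            ],
        },
    ]
}

# record_path="data" produces one row per element of each "data" list and
# flattens the nested "comment" dict into "comment.*" columns.
# meta copies the parent-level fields onto every row; errors="ignore"
# fills a missing meta key (the second entry has no "title") with NaN.
flat = pd.json_normalize(
    payload["comments_v2"],
    record_path="data",
    meta=["timestamp", "title"],
    errors="ignore",
)

print(flat)
```

Note that the column order differs from the table above (the `comment.*` columns come first, followed by the `meta` columns), so a final column selection such as `flat[[...]]` would be needed to reproduce the exact layout. Any key that never appears in the data, such as `comment.group` in this sample, will simply not show up as a column.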