I have a MongoDB collection containing documents like this:
doc = {
    "_id": {
        "$oid": "516622c9ce21150200000d87"
    },
    "SubmissionDate": {
        "$date": "2013-04-11T02:41:13.162Z"
    },
    "isComplete": True,
    "Rounds": [
        {
            "Photo": [],
            "A": {
                "Complexity": 55,
                "Colour": 85,
                "Deep": 51,
                "Effervescence": 44
            },
            "B": {
                "QualityPIDs": [],
                "QualityScales": [],
                "Complexity": 43,
                "Qualities": []
            },
            "C": {
                "QualityPIDs": [],
                "QualityScales": [],
                "Complexity": 60,
                "UHS": 46,
                "Colour": 33,
                "Qualities": []
            },
            "D": {
                "Complexity": 73,
                "Duration": 68,
                "Quality": 65
            }
        }
    ],
    "Item": {
        "_id": {
            "$oid": "51e6d678c06918db21156f92"
        },
        "Country": "Australia",
        "Name": "King",
        "PeopleId": {
            "$oid": "51dddb69a9d9350200000"
        },
        "Style": "Apple",
        "Type": "Flat",
        "UserSubmitted": False
    }
}
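I need to convert this collection into a pandas DataFrame.
The solution suggested in "How to import data from mongodb to pandas?" does most of the work, but I am still left with a Rounds column that contains dictionaries of dictionaries.
I wrote a loop to access the sub-dictionaries of Rounds: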
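import pandas as pd

df = pd.json_normalize(doc)
A_data = pd.DataFrame(columns=df.Rounds[0][0]['A'].keys())
for i in range(len(df.Rounds)):
    # flatten the "A" sub-dictionary of the first round in row i
    row = pd.json_normalize(df.Rounds[i][0]['A'])
    # pd.concat is used because DataFrame.append is gone as of pandas 2.0
    A_data = pd.concat([A_data, row], ignore_index=True)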
Finally, I join A_data to my main DataFrame.
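To make the join step concrete, here is a minimal sketch. The two small frames are hypothetical stand-ins for my real data (the `"Queen"` row and the `A.` column prefix are made up for illustration), and it assumes both frames share the same default 0..n-1 index:

```python
import pandas as pd

# Hypothetical stand-in for the main DataFrame (two documents).
df = pd.DataFrame({"isComplete": [True, False], "Item.Name": ["King", "Queen"]})

# Hypothetical stand-in for the A_data built by the loop, one row per document.
A_data = pd.DataFrame({"Complexity": [55, 40], "Colour": [85, 70]})

# Prefix the new columns so they stay identifiable, then join on the shared index.
combined = df.join(A_data.add_prefix("A."))
```

The prefix is there because sub-dictionaries such as A, C, and D all have a Complexity key, so joining them unprefixed would produce clashing column names.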
Is there a faster way? Right now the loop takes a long time. Thanks!
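For reference, one direction that might avoid the loop is letting `pd.json_normalize` expand Rounds itself through its `record_path` and `meta` parameters. This is only a sketch under assumptions: the documents below are hypothetical, trimmed-down versions of mine, and I have not checked how it behaves on the full collection.

```python
import pandas as pd

# Hypothetical, trimmed-down documents shaped like the one shown earlier.
docs = [
    {"isComplete": True, "Item": {"Name": "King"},
     "Rounds": [{"A": {"Complexity": 55, "Colour": 85}, "D": {"Quality": 65}}]},
    {"isComplete": False, "Item": {"Name": "Queen"},
     "Rounds": [{"A": {"Complexity": 40, "Colour": 70}, "D": {"Quality": 50}}]},
]

# One output row per entry in "Rounds"; nested dicts become dotted columns
# such as "A.Complexity", and the meta fields are repeated alongside each row.
flat = pd.json_normalize(
    docs,
    record_path="Rounds",
    meta=["isComplete", ["Item", "Name"]],
)
```

Note that this produces one row per round rather than one row per document, so a document with several rounds would appear several times.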