Here is an example of one of the JSON objects the API gives me. There are 100 of them.
[{"id": "133248644",
"associations": {"deals": {"results": [{"id": "2762673039",
"type": "line_item_to_deal"}]}},
"properties": {
"createdate": "2020-08-06T15:05:23.253Z",
"description": null,
"hs_lastmodifieddate": "2020-08-06T15:05:23.253Z",
"hs_object_id": "133248644",
"name": "test product",
"price": "100"},
"createdAt": "2020-08-06T15:05:23.253Z",
"updatedAt": "2020-08-06T15:05:23.253Z",
"archived": false}]
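I want to create a pandas DataFrame that has an id column and all of the properties associated with it, plus the id nested under "associations". Essentially, I want to remove the nesting of the properties under "properties" and of the id under "associations" (and rename that id). How can I do this?
Here is a reproducible example of my attempts to solve the problem: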
import json
import pandas as pd
response = """[{"id": "133248644",
"associations": {"deals": {"results": [{"id": "2762673039",
"type": "line_item_to_deal"}]}},
"properties": {
"createdate": "2020-08-06T15:05:23.253Z",
"description": null,
"hs_lastmodifieddate": "2020-08-06T15:05:23.253Z",
"hs_object_id": "133248644",
"name": "test product",
"price": "100"},
"createdAt": "2020-08-06T15:05:23.253Z",
"updatedAt": "2020-08-06T15:05:23.253Z",
"archived": false},
{"id": "133345685",
"associations": {"deals": {"results": [{"id": "2762673038",
"type": "line_item_to_deal"}]}},
"properties": {
"createdate": "2020-08-06T18:29:06.773Z",
"description": null,
"hs_lastmodifieddate": "2020-08-06T18:29:06.773Z",
"hs_object_id": "133345685",
"name": "TEST PRODUCT 2",
"price": "2222"},
"createdAt": "2020-08-06T18:29:06.773Z",
"updatedAt": "2020-08-06T18:29:06.773Z",
"archived": false}]"""
data = json.loads(response)
data_flat = [dict(id=x["id"], **x["properties"]) for x in data]
The following is a better solution, but it is still not quite perfect:
data_flat = [dict(lineid=x["id"],dealid=x["associations"]["deals"]["results"][0]["id"], **x["properties"]) for x in data]
Finally, this is very useful, but it would still require me to extract the id from the associations column in a complicated way:
normal_data = pd.json_normalize(data)
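One possible direction, as a rough sketch rather than a definitive answer: pd.json_normalize accepts record_path and meta arguments, which can pull the nested deal ids out into their own rows while carrying the line item id along. The sample data below is a trimmed-down stand-in for the response above, the column names lineid and dealid are simply borrowed from my earlier attempt, and the sketch assumes every item actually has an associations -> deals -> results key (a missing key would raise a KeyError).

```python
import json
import pandas as pd

# Trimmed-down stand-in for the API response (assumption: same shape as above).
response = """[
  {"id": "133248644",
   "associations": {"deals": {"results": [{"id": "2762673039", "type": "line_item_to_deal"}]}},
   "properties": {"name": "test product", "price": "100", "description": null}},
  {"id": "133345685",
   "associations": {"deals": {"results": [{"id": "2762673038", "type": "line_item_to_deal"}]}},
   "properties": {"name": "TEST PRODUCT 2", "price": "2222", "description": null}}
]"""
data = json.loads(response)

# One row per associated deal. The prefixes avoid a name clash between the
# deal's "id" and the line item's "id": columns become dealid, dealtype, lineid.
deals = pd.json_normalize(
    data,
    record_path=["associations", "deals", "results"],
    meta=["id"],
    record_prefix="deal",
    meta_prefix="line",
)

# One row per line item, with the properties un-nested.
props = pd.DataFrame([{"lineid": x["id"], **x["properties"]} for x in data])

# how="right" keeps line items whose results list is empty (dealid becomes NaN).
flat = deals[["lineid", "dealid"]].merge(props, on="lineid", how="right")
print(flat)
```

Unlike indexing results[0], this yields one row per deal when a line item is associated with several deals, which may or may not be what is wanted.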