I have a large CSV file in which one row looks like this:
id_85,
{
"link": "some link",
"icon": "hello.gif",
"name": "Wall Photos",
"comments": {
"count": 0
},
"updated_time": "2012-03-12",
"object_id": "400",
"is_published": true,
"properties": [
{
"text": "University",
"name": "By",
"href": "some link"
}
],
"from": {
"id": "7778",
"name": "Let"
},
"message": "Hello World! :D",
"id": "id_85",
"created_time": "2012-03-12",
"to": {
"data": [
{
"id": "100",
"name": "March"
}
]
},
"message_tags": {
"0": [
{
"id": "100",
"type": "user",
"name": "Marcelo",
"length": 7,
"offset": 0
}
]
},
"type": "photo",
"caption": "Hello world!"
}
I am trying to extract the JSON part of the row, i.e. everything from the first curly brace to the last one.
Here is the Python regex code I have so far:
import re
# Single quotes around the literal so the double quotes inside the JSON do not end the string.
line = 'id_85,{"link": "some link", "icon": "hello.gif", "name": "Wall Photos", "comments": {"count": 0}, "updated_time": "2012-03-12", "object_id": "400", "is_published": true, "properties": [{"text": "University", "name": "By", "href": "some link"}], "from": {"id": "777", "name": "Let"}, "message": "Hello World! :D", "id": "id_85", "created_time": "2012-03-12", "to": {"data": [{"id": "100", "name": "March"}]}, "message_tags": {"0": [{"id": "100", "type": "user", "name": "March", "length": 7, "offset": 0}]}, "type": "photo", "caption": "Hello world!"} '
m = re.match(r'.*,({.*}$)', line)
if m:
    print(m.group(1))
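In some cases the match does not run from the first curly brace to the last one, e.g. { ... }. How can I make sure that I get only the text between the first and last curly braces and nothing else?
The desired output looks like this: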
{"link": "some link", "icon": "hello.gif", "name": "Wall Photos", "comments": {"count": 0}, "updated_time": "2012-03-12", "object_id": "400", "is_published": true, "properties": [{"text": "University", "name": "By", "href": "some link"}], "from": {"id": "777", "name": "Let"}, "message": "Hello World! :D", "id": "id_85", "created_time": "2012-03-12", "to": {"data": [{"id": "100", "name": "March"}]}, "message_tags": {"0": [{"id": "100", "type": "user", "name": "March", "length": 7, "offset": 0}]}, "type": "photo", "caption": "Hello world!"}
Thanks!
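For reference, here is a minimal sketch of one way this could be done, assuming each row contains exactly one JSON object and nothing after it except whitespace. Two things in the pattern above may be getting in the way: the leading `.*,` is greedy, so it can run past the first `,{` to a later one, and `}$` cannot match when the row ends with a trailing space. The helper names `extract_json_text` and `extract_json_text_re` and the shortened sample row are made up for illustration.

```python
import json
import re


def extract_json_text(line):
    """Return the text from the first '{' to the last '}' in line, or None."""
    start = line.find('{')   # index of the first opening brace, -1 if absent
    end = line.rfind('}')    # index of the last closing brace, -1 if absent
    if start == -1 or end < start:
        return None
    return line[start:end + 1]


def extract_json_text_re(line):
    """Regex variant: search() starts at the first '{', greedy '.*' runs to the last '}'."""
    m = re.search(r'\{.*\}', line, re.DOTALL)  # DOTALL lets the match span line breaks
    return m.group(0) if m else None


# Hypothetical shortened row: it contains an inner ',{' and a trailing space.
line = 'id_85,{"name": "Wall Photos", "comments": {"count": 0}, "to": {"data": [{"id": "100"},{"id": "101"}]}} '

text = extract_json_text(line)
print(text)

# Parsing the result is a cheap check that the slice really is valid JSON.
data = json.loads(text)
print(data["comments"]["count"])
```

Neither variant counts braces, so a stray `}` in another column after the JSON would be included; in that case parsing the row with the `csv` module first would probably be more robust.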