I'm trying to parse a .txt file and convert it into a JSON object. This is what the .txt file looks like:
0000000159900000000000 John Smith State Senator
0001001159915701270000DEM Jill Booker Governor
0002001159900000000000REP James Williams City Council
I'd like to end up with something like this:
obj = [
    {
        "name": "State Senator",
        "reporting_units": [
            {
                "state_postal": "IL",
                "precincts_reporting": precincts_reporting,
                "total_precincts": precincts_total,
                "candidates": [
                    {
                        "cand_name": "John Smith",
                        "cand_id": "0000000159900000000000"
                    }
                ]
            }
        ]
    },
    {
        "name": "Governor",
        "reporting_units": [
            {
                "state_postal": "IL",
                "precincts_reporting": precincts_reporting,
                "total_precincts": precincts_total,
                "candidates": [
                    {
                        "cand_name": "Jill Booker",
                        "cand_id": "0001001159915701270000DEM"
                    }
                ]
            }
        ]
    },
    {
        "name": "City Council",
        "reporting_units": [
            {
                "state_postal": "IL",
                "precincts_reporting": precincts_reporting,
                "total_precincts": precincts_total,
                "candidates": [
                    {
                        "cand_name": "James Williams",
                        "cand_id": "0002001159900000000000REP"
                    }
                ]
            }
        ]
    }
]
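For illustration, here is a minimal sketch of how one entry of that structure might be built and serialized. It assumes the races are kept in a list (a single JSON object cannot repeat the "name" key), that IDs are stored as strings so the leading zeros and party suffix survive, and that the precinct counts are left as None because the sample data does not show where they come from. `build_race` is a hypothetical helper name, not part of any library.

```python
import json


def build_race(name, cand_name, cand_id,
               precincts_reporting=None, total_precincts=None):
    """Return one race entry shaped like the desired structure.

    The precinct values default to None: their source is not shown
    in the sample data, so they are placeholders here.
    """
    return {
        "name": name,
        "reporting_units": [
            {
                "state_postal": "IL",
                "precincts_reporting": precincts_reporting,
                "total_precincts": total_precincts,
                "candidates": [
                    {"cand_name": cand_name, "cand_id": cand_id}
                ],
            }
        ],
    }


# Build a list with one race and serialize it to a JSON string.
races = [build_race("State Senator", "John Smith", "0000000159900000000000")]
races_json = json.dumps(races, indent=2)
print(races_json)
```

Keeping `cand_id` as a string matters here: a bare `0000000159900000000000` is not a valid number literal in JSON or Python, and converting it to an integer would drop the leading zeros.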
If the .txt file were delimited, this would be a relatively simple loop, but it isn't. What I would normally do is something like this:
from bs4 import BeautifulSoup
import json
import requests

def scraper():
    empty_list = []
    # define object (as above)
    with open('races.txt', 'r') as f:
        data = f.readlines()
        for row in data:
            for item in row:
                cand_id = row[0]
                cand_name = row[1]
                name = row[2]
                # plug into object and append to list

scraper()
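The problem is that row[0], row[1], and row[2] are all 0. Because I can't split the row on a delimiter, each row stays a single string, and indexing it returns one character at a time. I don't know how to write this so that it reads a whole field, rather than a single character, as one item. Any help is appreciated!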