python - Read items from a csv and update the same items in another csv

Question

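I am working on a way to read data from input.csv and update the stock column in output.csv, matching products by id.

These are the steps I am working on right now:

1. Read product info from input.csv into input_data = [], which gives a list of OrderedDict.

input_data currently looks like this: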

[OrderedDict([('id', '1'), ('name', 'a'), ('stock', '33')]), OrderedDict([('id', '2'), ('name', 'b'), ('stock', '66')]), OrderedDict([('id', '3'), ('name', 'c'), ('stock', '99')])]

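2. Read the current product info from output.csv into output_data = [], which has the same schema as input_data.

3. Iterate through input_data and update the stock column in output_data based on the stock info in input_data. What is the best way to do this?

-> An important point: some IDs may exist in input_data but not in output_data. I want to update the stock for the ids common to input_data and output_data, while the "new" ids will most likely be written to a new csv.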

I was thinking of something like this (this isn't real code):

for p in input_data:
    # check if p['id'] exists in the list of output_data IDs (I might have to create a list of IDs in output_data for this as well, in order to check it against input_data IDs)
    # if p['id'] exists in output_data, write the Stock to the corresponding product in output_data
    # else, append p to another_csv

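I know this looks messy. What I am asking for is a logical way to approach this task without wasting too much compute time. The files in question may be 100,000 rows long, so performance and speed will be an issue.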

If my data from input_data and output_data are lists of OrderedDict, what is the best way to check for the id in input_data and write the stock to the product with the exact same id in output_data?
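
For illustration, here is a minimal sketch of the lookup described above. It assumes both lists hold row dicts with id, name and stock keys as shown earlier. The helper name update_stock and the output_by_id index are hypothetical, not part of the original post. Building the index once turns each id check into a dict lookup instead of a scan of output_data.

```python
def update_stock(input_data, output_data):
    """Update stock in output_data in place; return rows whose id is new."""
    # Index output rows by id once. Each value is the same dict object
    # that lives in output_data, so editing it edits the list entry too.
    output_by_id = {row['id']: row for row in output_data}
    new_rows = []
    for row in input_data:
        match = output_by_id.get(row['id'])
        if match is None:
            new_rows.append(row)  # id only exists in input_data
        else:
            match['stock'] = row['stock']
    return new_rows


# Sample rows mirroring the shape shown in the question
# (plain dicts behave the same as OrderedDict here).
input_data = [
    {'id': '1', 'name': 'a', 'stock': '33'},
    {'id': '2', 'name': 'b', 'stock': '66'},
    {'id': '3', 'name': 'c', 'stock': '99'},
]
output_data = [
    {'id': '1', 'name': 'a', 'stock': '31'},
    {'id': '3', 'name': 'c', 'stock': '95'},
]
new_rows = update_stock(input_data, output_data)
```

After the call, output_data carries the stock values from input_data for ids 1 and 3, and new_rows holds the row for id 2, ready to be written to a separate csv.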

score 1 · Accepted Answer

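While Python might not be your best option, I wouldn't use lists of OrderedDict for this task. Finding the item to change within output_data takes O(n) for each lookup, which turns your whole script into O(n**2). I would keep the two files in dicts keyed by id (OrderedDicts if you care about order), like this, which brings the complexity of the whole thing down to O(n):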

input_data = {
    '1': ['a', '33'],
    '2': ['b', '66'],
    '3': ['c', '99']
}
output_data = {
    '1': ['a', '31'],
    '3': ['c', '95']
}

# iterate through all keys in input_data and update output_data
# if a key does not exist in output_data, create it in a different dict
new_data = {}
for key in input_data:
    if key not in output_data:
        new_data[key] = input_data[key]
        # for optimisation's sake you could append data into the new file here
        # and not save into a new dict
    else:
        output_data[key][1] = input_data[key][1]
        # for optimisation's sake you could append data into a new output file here
        # and rename/move the new output file into the old output file after the script finishes
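
To connect the dict approach to actual files, here is a rough sketch using the standard csv module. The function name sync_stock, the three path parameters and the assumption that every file shares the header id,name,stock are hypothetical. It follows the comments in the code above: updated rows are streamed to a temporary file that replaces the old output file at the end, and ids missing from the output file go to a separate csv.

```python
import csv
import os


def sync_stock(input_path, output_path, new_path):
    """Copy stock values from input_path into output_path, matching on id.

    Rows whose id is missing from output_path are written to new_path.
    Assumes all files share the header id,name,stock (an assumption).
    """
    # Load the input file into a dict keyed by id for fast lookups.
    with open(input_path, newline='') as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        input_by_id = {row['id']: row for row in reader}

    seen = set()
    tmp_path = output_path + '.tmp'
    # Stream the old output file row by row into a temporary file.
    with open(output_path, newline='') as src, \
            open(tmp_path, 'w', newline='') as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in csv.DictReader(src):
            match = input_by_id.get(row['id'])
            if match is not None:
                row['stock'] = match['stock']
                seen.add(row['id'])
            writer.writerow(row)
    os.replace(tmp_path, output_path)  # swap in the updated file

    # Any input row that was never matched is a new product.
    with open(new_path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for key, row in input_by_id.items():
            if key not in seen:
                writer.writerow(row)
```

Only the input file is held in memory here; the output file is processed one row at a time, so each file is read once and the work stays linear in the number of rows.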
