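I have a dataset with two columns that contain frozensets. Now I would like to merge these frozensets, i.e. take their union row by row. I can do this with a for loop, but my dataset has more than 27 million rows, so I am looking for a way to avoid the for loop. Does anyone have any ideas?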
Data
import pandas as pd
import numpy as np
d = {'ID1': [frozenset(['a', 'b']), frozenset(['a', 'c']), frozenset(['c', 'd'])],
     'ID2': [frozenset(['c', 'g']), frozenset(['i', 'f']), frozenset(['t', 'l'])]}
df = pd.DataFrame(data=d)
Code with the for loop
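from functools import reduce
# Collect the row-wise unions in a list and assign the column once;
# this avoids chained assignment via df['frozenset'].iloc[i] = ...
result = []
for i in range(len(df)):
    result.append(reduce(frozenset.union, [df['ID1'][i], df['ID2'][i]]))
df['frozenset'] = result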
Desired output
      ID1     ID2     frozenset
0  (a, b)  (c, g)  (a, c, g, b)
1  (a, c)  (f, i)  (a, c, f, i)
2  (c, d)  (t, l)  (c, d, t, l)
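
For reference, here is a minimal sketch of one way the row-wise union might be written without indexing into the DataFrame on every iteration. It pairs the two columns with zip and uses the `|` operator for frozenset union. It is still Python-level iteration (as far as I know, object columns holding frozensets do not get true vectorization), so treat it as a lighter-weight loop rather than a guaranteed solution for 27 million rows. The sample data is the same as above.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'ID1': [frozenset(['a', 'b']), frozenset(['a', 'c']), frozenset(['c', 'd'])],
    'ID2': [frozenset(['c', 'g']), frozenset(['i', 'f']), frozenset(['t', 'l'])],
})

# zip walks both columns in lockstep; `a | b` is the frozenset union.
# The whole column is built as a list and assigned in one step.
df['frozenset'] = [a | b for a, b in zip(df['ID1'], df['ID2'])]

print(df)
```

The element order shown inside each frozenset when printing may differ from the desired output above, since sets are unordered.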