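I want to replace NaN values in one column of a DataFrame based on a condition in another column. If column [0] contains "Passenger-Kilometers", I want to fill the NaN in column [1] of that row with the value "Total passenger transport", as at index 14 of df shown below (other NaN values get a different replacement; see the mapping totals_dict below). I tried the loop below and it works in every case, but I would like to find a more elegant solution.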

totals_dict = {"Passenger-Kilometers": "Total passenger transport",
               "Freight Ton-Kilometers": "Total freight transport",}
for key, value in totals_dict.items():
    df[df[0] == key] = df[df[0] == key].fillna(value)

Is there a cleaner, different way to solve this?

Alternatively, I tried:

df = df.groupby(0).assign(target_col=lambda group: group["target_col"].fillna(totals_dict.get(group[0])))

Unfortunately, a groupby object does not have an assign method.

df looks like this:

                       0                                         1
1          Vehicle Stock                Medium Trucks(10000 units)
2          Vehicle Stock                 Heavy Trucks(10000 units)
3          Vehicle Stock                       Trucks(10000 units)
4          Vehicle Stock      Mini Passenger Vehicles(10000 units)
5          Vehicle Stock     Small Passenger Vehicles(10000 units)
6          Vehicle Stock    Medium Passenger Vehicles(10000 units)
7          Vehicle Stock                 Light Trucks(10000 units)
8          Vehicle Stock     Large Passenger Vehicles(10000 units)
9          Vehicle Stock               Civil Vehicles(10000 units)
10  Passenger-Kilometers  Civil Aviation(100 million passenger-km)
11  Passenger-Kilometers       Waterways(100 million passenger-km)
12  Passenger-Kilometers        Highways(100 million passenger-km)
13  Passenger-Kilometers        Railways(100 million passenger-km)
14  Passenger-Kilometers                                      None
15         Vehicle Stock           Passenger Vehicles(10000 units)

Thanks!

1 Answer

Suppose I have this DataFrame:

>>> a
                      0                                         1
0  Passenger-Kilometers  Civil Aviation(100 million passenger-km)
1  Passenger-Kilometers       Waterways(100 million passenger-km)
2  Passenger-Kilometers                                      None
3  Passenger-Kilometers                                      None
4  Passenger-Kilometers                                      None

Then I can run the following:

def b(x):
    # Overwrite column 1 of the row with the replacement value
    x[1] = "hello"
    return x

# Rows where column 0 matches and column 1 is missing
mask = (a[0] == "Passenger-Kilometers") & (a[1].isnull())
a[mask] = a[mask].apply(b, axis=1)

Now, if I look:

>>> a
                      0                                         1
0  Passenger-Kilometers  Civil Aviation(100 million passenger-km)
1  Passenger-Kilometers       Waterways(100 million passenger-km)
2  Passenger-Kilometers                                     hello
3  Passenger-Kilometers                                     hello
4  Passenger-Kilometers                                     hello

So you can replace "hello" with whatever you need.
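If the goal is to avoid both the loop and the row-wise apply, one possible alternative is to map column 0 through totals_dict and pass the result to fillna. This is only a sketch: the small df below is a made-up stand-in for the DataFrame in the question, and it assumes the columns are labeled with the integers 0 and 1 as shown there.

```python
import pandas as pd

totals_dict = {
    "Passenger-Kilometers": "Total passenger transport",
    "Freight Ton-Kilometers": "Total freight transport",
}

# Made-up stand-in for the question's df (integer column labels 0 and 1)
df = pd.DataFrame({
    0: ["Vehicle Stock", "Passenger-Kilometers", "Passenger-Kilometers",
        "Freight Ton-Kilometers", "Vehicle Stock"],
    1: ["Trucks(10000 units)", "Railways(100 million passenger-km)",
        None, None, None],
})

# Look up the replacement for each row from column 0;
# categories that are not in totals_dict map to NaN
fill_values = df[0].map(totals_dict)

# fillna with a Series aligns on the index, so each missing cell in
# column 1 receives the replacement for its own row; rows without a
# mapping (here "Vehicle Stock") stay missing
df[1] = df[1].fillna(fill_values)

print(df)
```

Because fillna only touches missing cells, existing values in column 1 are left alone, and all keys in totals_dict are handled in one pass.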

answered 2021-03-03T14:34:16.540