python - Stacking DataFrames on top of each other

Question

Suppose I have the following DataFrames:

import pandas as pd

f1 = pd.DataFrame({
    'feature1': ['a', 'b','c'],
    'col2': [1,2,3],
    'col3': [0.1,0.2,0.3]
})

f2 = pd.DataFrame({
    'feature2': ['x', 'y','z'],
    'col2': [4,5,6],
    'col3': [0.4,0.5,0.6]
})

f3 = pd.DataFrame({
    'feature2': ['i', 'j','k'],
    'col2': [7,8,9],
    'col3': [0.7,0.8,0.9]
})

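I want to build a new DataFrame that stacks the three DataFrames on top of each other, keeping each one's column names as a regular row, so that I get: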

     Feature  Col1  Col2
0   feature1  col2  col3
1          a     1   0.1
2          b     2   0.2
3          c     3   0.3
4   feature2  col2  col3
5          x     4   0.4
6          y     5   0.5
7          z     6   0.6
8   feature2  col2  col3
9          i     7   0.7
10         j     8   0.8
11         k     9   0.9

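So far, I have been doing this by exporting each DataFrame to Excel and manually copying and pasting them into a new worksheet, so that I get the desired result when I import the final Excel file back into Python. But surely there is a way to do this efficiently within Python itself?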

score 2 · Accepted Answer

You can write a function that preprocesses each DataFrame with pandas.DataFrame.T, pandas.DataFrame.reset_index and pandas.DataFrame.set_axis, and then combine the results with pandas.concat:

def preprocess(df, new_cols=['Feature', 'Col1', 'Col2']):
    """
    Make columns the first row of the dataframe.
    And replace column names with `new_cols`.
    """
    return df.T.reset_index().T.set_axis(new_cols, axis='columns')

>>> pd.concat(map(preprocess, [f1, f2, f3]), ignore_index=True)

     Feature  Col1  Col2
0   feature1  col2  col3
1          a     1   0.1
2          b     2   0.2
3          c     3   0.3
4   feature2  col2  col3
5          x     4   0.4
6          y     5   0.5
7          z     6   0.6
8   feature2  col2  col3
9          i     7   0.7
10         j     8   0.8
11         k     9   0.9
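To see why the `.T.reset_index().T` chain moves the column names into the first row, it can help to look at each intermediate step. The following is a minimal sketch using `f1` from the question; the names `step1`, `step2` and `step3` are only illustrative.

```python
import pandas as pd

f1 = pd.DataFrame({
    'feature1': ['a', 'b', 'c'],
    'col2': [1, 2, 3],
    'col3': [0.1, 0.2, 0.3]
})

# Step 1: transpose, so the column names become the row index.
step1 = f1.T
print(step1)

# Step 2: reset_index turns that index into an ordinary column named 'index'.
step2 = step1.reset_index()
print(step2)

# Step 3: transpose back, so the former column names are now the first row,
# labelled 'index', followed by the original rows 0, 1, 2.
step3 = step2.T
print(step3)
```

After the last step the column labels are just 0, 1, 2, which is why `set_axis` is then used to assign the final names.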

Here, preprocess(f1) gives:

>>> preprocess(f1)
        Feature  Col1  Col2
index  feature1  col2  col3
0             a     1   0.1
1             b     2   0.2
2             c     3   0.3

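The index of each preprocessed frame looks different (the header row carries the label index), so we pass ignore_index=True to pandas.concat, which replaces the resulting index with a pandas.RangeIndex starting from 0.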
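As a small illustration of what `ignore_index=True` changes, the sketch below concatenates two preprocessed frames with and without it. It redefines `preprocess` so the block runs on its own; the variable names `parts`, `kept` and `renumbered` are only illustrative.

```python
import pandas as pd

f1 = pd.DataFrame({'feature1': ['a', 'b', 'c'], 'col2': [1, 2, 3], 'col3': [0.1, 0.2, 0.3]})
f2 = pd.DataFrame({'feature2': ['x', 'y', 'z'], 'col2': [4, 5, 6], 'col3': [0.4, 0.5, 0.6]})

def preprocess(df, new_cols=('Feature', 'Col1', 'Col2')):
    # Move the column names into the first row, then rename the columns.
    return df.T.reset_index().T.set_axis(list(new_cols), axis='columns')

parts = [preprocess(f1), preprocess(f2)]

# Without ignore_index, each frame keeps its own labels, so they repeat.
kept = pd.concat(parts)
print(list(kept.index))        # ['index', 0, 1, 2, 'index', 0, 1, 2]

# With ignore_index=True, the rows are renumbered 0..7.
renumbered = pd.concat(parts, ignore_index=True)
print(list(renumbered.index))  # [0, 1, 2, 3, 4, 5, 6, 7]
```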

score 2 · Accepted Answer

Make a small adjustment with a custom function:

def ff(x):
    x = x.T.reset_index().T
    x.columns = ['feature','col1','col2']
    return x
out = pd.concat([ff(f1),ff(f2),ff(f3)]).reset_index(drop=True)

out
Out[96]: 
     feature  col1  col2
0   feature1  col2  col3
1          a     1   0.1
2          b     2   0.2
3          c     3   0.3
4   feature2  col2  col3
5          x     4   0.4
6          y     5   0.5
7          z     6   0.6
8   feature2  col2  col3
9          i     7   0.7
10         j     8   0.8
11         k     9   0.9
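One thing to be aware of with either approach: because the header strings now sit in the same columns as the numbers, those columns end up with object dtype rather than int or float. If the numbers are needed later, something like the following sketch could recover them; using `pd.to_numeric` with `errors='coerce'` is my own suggestion here, not part of the answers above.

```python
import pandas as pd

f1 = pd.DataFrame({'feature1': ['a', 'b', 'c'], 'col2': [1, 2, 3], 'col3': [0.1, 0.2, 0.3]})

def ff(x):
    x = x.T.reset_index().T
    x.columns = ['feature', 'col1', 'col2']
    return x

out = ff(f1).reset_index(drop=True)

# The columns that mix header strings and numbers are stored as generic objects.
print(out[['col1', 'col2']].dtypes)

# Convert one column back to numbers; the header row ('col2') becomes NaN.
numeric = pd.to_numeric(out['col1'], errors='coerce')
print(numeric)
```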
