kedro - How do I get intermediate datasets from a pipeline in Kedro?

Question

I'm working on my pipeline and testing it manually in a Jupyter notebook.

Here is my situation.

I want to get example_train and example_valid out of it, so I wrote this:

context.pipeline.to_outputs("example_train", "example_valid")

I passed the resulting pipeline to SequentialRunner and got both datasets back.

I also want total_steps, so I changed the line like this:

context.pipeline.to_outputs("example_train", "example_valid", "total_steps")

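However, the result no longer contains example_train. I understand why: example_train is not a final output of this modified pipeline, so it is not returned.

Is there a way to get intermediate datasets in a case like this?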

score 0 · Accepted Answer

You can define these datasets in the data catalog (catalog.yml) and specify where they should be stored.

For example:

example_train:
  type: pandas.CSVDataSet
  filepath: data/02_intermediate/example_train.csv
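The same idea extends to every dataset you want to inspect after the run. The runner only returns outputs that the catalog cannot store, so a dataset consumed by a later node has to be registered to survive. Below is a minimal sketch, assuming example_valid is also a pandas DataFrame; its entry and file path are hypothetical and simply mirror the example_train entry.

```yaml
# catalog.yml (sketch): persist each dataset you want to inspect after the run.
example_train:
  type: pandas.CSVDataSet
  filepath: data/02_intermediate/example_train.csv

# Assumption: example_valid is a pandas DataFrame too; this path is hypothetical.
example_valid:
  type: pandas.CSVDataSet
  filepath: data/02_intermediate/example_valid.csv
```

Once a dataset is registered, it is saved to its file when the node that produces it runs, and you should be able to read it back in the notebook with context.catalog.load("example_train"). total_steps probably needs no entry if the runner already returns it as a final output of the sliced pipeline.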
