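I am trying to add the layer attribute to my catalog. A common pattern I have is to fetch some data (raw), clean it, and then output a list of parts (pri). I then need metadata for those parts, so I take the list of parts from pri and pass it to a function that fetches more data (raw). The pipeline itself is not circular, but Kedro does not seem to like it when the layers end up circular.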
Is there a common pattern for this use case that I am missing?
Is it possible to allow layers to be circular?
Example
I have tried to put together a generic example below.
raw_truck_sales:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: raw
int_truck_sales:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: int
pri_truck_sales:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: pri
pri_truck_models_sold:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: pri
raw_truck_metadata:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: raw
int_truck_metadata:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: int
pri_truck_metadata:
  type: pandas.ParquetDataSet
  filepath: <filepath>
  layer: pri

from kedro.pipeline import node

# get_truck_sales, create_int_truck_sales, etc. are my own functions,
# defined elsewhere in the project.
nodes = [
    node(
        get_truck_sales,
        inputs=None,
        outputs='raw_truck_sales',
    ),
    node(
        create_int_truck_sales,
        inputs='raw_truck_sales',
        outputs='int_truck_sales',
    ),
    node(
        create_pri_truck_sales,
        inputs='int_truck_sales',
        outputs='pri_truck_sales',
    ),
    node(
        lambda truck_sales: truck_sales[['model']],
        inputs='pri_truck_sales',
        outputs='pri_truck_models_sold',
    ),
    # This node takes the list of trucks sold and gets metadata for them.
    # It seems to break Kedro's layer model by creating a circular reference:
    # a pri dataset feeds a raw dataset.
    node(
        get_truck_metadata,
        inputs='pri_truck_models_sold',
        outputs='raw_truck_metadata',
    ),
    node(
        create_int_truck_metadata,
        inputs='raw_truck_metadata',
        outputs='int_truck_metadata',
    ),
    node(
        create_pri_truck_metadata,
        inputs='int_truck_metadata',
        outputs='pri_truck_metadata',
    ),
]
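
To make the problem concrete, here is a minimal sketch of what I mean by "the pipeline is not circular but the layers are". It does not use Kedro and is not meant to reflect how Kedro checks layers internally; that part is my assumption. It only rebuilds the dataset dependencies from the example above, collapses them onto their layers, and checks both graphs for cycles. The helper names `has_cycle` and `layer_edges` are made up for this illustration, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter

# Layer of each dataset, following the catalog in the example
# (dataset names as used in the node definitions).
LAYERS = {
    "raw_truck_sales": "raw",
    "int_truck_sales": "int",
    "pri_truck_sales": "pri",
    "pri_truck_models_sold": "pri",
    "raw_truck_metadata": "raw",
    "int_truck_metadata": "int",
    "pri_truck_metadata": "pri",
}

# (input dataset, output dataset) for each node that has an input.
EDGES = [
    ("raw_truck_sales", "int_truck_sales"),
    ("int_truck_sales", "pri_truck_sales"),
    ("pri_truck_sales", "pri_truck_models_sold"),
    ("pri_truck_models_sold", "raw_truck_metadata"),
    ("raw_truck_metadata", "int_truck_metadata"),
    ("int_truck_metadata", "pri_truck_metadata"),
]


def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        # TopologicalSorter expects a mapping of node -> predecessors.
        graph.setdefault(dst, set()).add(src)
        graph.setdefault(src, set())
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return True
    return False


def layer_edges(edges, layers):
    """Collapse dataset edges into layer edges, dropping same-layer edges."""
    return {(layers[s], layers[d]) for s, d in edges if layers[s] != layers[d]}


dataset_cyclic = has_cycle(EDGES)
layer_cyclic = has_cycle(layer_edges(EDGES, LAYERS))
print("datasets cyclic:", dataset_cyclic)  # False: the datasets form a simple chain
print("layers cyclic:", layer_cyclic)      # True: raw -> int -> pri -> raw
```

The dataset graph is a straight chain, so it has no cycle. Once the datasets are grouped by layer, the edge from `pri_truck_models_sold` to `raw_truck_metadata` becomes a pri -> raw edge, which closes the loop raw -> int -> pri -> raw.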