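I want to get rows containing a nested row out of Beam SQL (SqlTransform), but have not been able to.

Questions:

  1. What is the proper way to output rows with a nested row from SqlTransform? (The ROW type is described in the docs, so I believe it is supported.)
  2. If this is a bug or a missing feature, is it a problem in Beam itself, or does it depend on the runner? (I am currently using DirectRunner, but will use DataflowRunner in the future.)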

Version info:

  • OS: macOS 10.15.7 (Catalina)
  • Java: 11.0.11 (AdoptOpenJDK)
  • Beam SDK: 2.32.0

Here is what I have tried, without luck.

Calcite dialect

SELECT ROW(foo, bar) as my_nested_row FROM PCOLLECTION

I expected this to output rows with the following schema:

Field{name=my_nested_row, description=, type=ROW<foo STRING NOT NULL, bar INT64 NOT NULL> NOT NULL, options={{}}}

But the row is actually split into scalar fields, like this:

Field{name=my_nested_row$$0, description=, type=STRING NOT NULL, options={{}}}
Field{name=my_nested_row$$1, description=, type=INT64 NOT NULL, options={{}}}

ZetaSQL

SELECT STRUCT(foo, bar) as my_nested_row FROM PCOLLECTION

I got an error:

java.lang.UnsupportedOperationException: Does not support expr node kind RESOLVED_MAKE_STRUCT
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:363)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr (ExpressionConverter.java:323)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromComputedColumnWithFieldList (ExpressionConverter.java:375)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.retrieveRexNode (ExpressionConverter.java:203)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:45)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert (ProjectScanConverter.java:29)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode (QueryStatementConverter.java:102)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convert (QueryStatementConverter.java:89)
    at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertRootQuery (QueryStatementConverter.java:55)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel (ZetaSQLPlannerImpl.java:98)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal (ZetaSQLQueryPlanner.java:197)
    at org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel (ZetaSQLQueryPlanner.java:185)
    at org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery (BeamSqlEnv.java:111)
    at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:171)
    at org.apache.beam.sdk.extensions.sql.SqlTransform.expand (SqlTransform.java:109)
    at org.apache.beam.sdk.Pipeline.applyInternal (Pipeline.java:548)
    at org.apache.beam.sdk.Pipeline.applyTransform (Pipeline.java:482)
    at org.apache.beam.sdk.values.PCollection.apply (PCollection.java:363)
    at dev.tmshn.playbeam.Main.main (Main.java:29)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:829)
1 Answer

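Unfortunately, Beam SQL does not support nested rows yet, mainly because of a lack of support in Calcite (and, correspondingly, in the ZetaSQL implementation). See this similar question focused on Dataflow.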

On the bright side, the Jira issue tracking this support appears to have been resolved for 2.34.0, so proper support is probably coming soon.
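Until then, one possible workaround is to accept the flattened output and regroup the fields yourself in a step after SqlTransform. The sketch below only illustrates the regrouping logic in plain Java, using maps instead of Beam rows: it assumes the flattened names follow the `<name>$$<index>` pattern shown in the question and that the entries arrive in index order. The class name `NestFlattenedFields` and the method `nest` are hypothetical names made up for this example, not part of Beam.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestFlattenedFields {
    // Separator seen in the flattened field names in the question (my_nested_row$$0).
    private static final String SEPARATOR = "$$";

    /**
     * Groups entries whose key looks like "<name>$$<index>" into a list stored
     * under "<name>". Keys without the separator are copied through unchanged.
     * Assumption: entries are iterated in index order (e.g. a LinkedHashMap).
     */
    public static Map<String, Object> nest(Map<String, Object> flat) {
        Map<String, Object> nested = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : flat.entrySet()) {
            String key = entry.getKey();
            int split = key.indexOf(SEPARATOR);
            if (split < 0) {
                // Ordinary scalar field: keep it as is.
                nested.put(key, entry.getValue());
                continue;
            }
            String parent = key.substring(0, split);
            @SuppressWarnings("unchecked")
            List<Object> children =
                (List<Object>) nested.computeIfAbsent(parent, k -> new ArrayList<>());
            children.add(entry.getValue());
        }
        return nested;
    }

    public static void main(String[] args) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flat.put("id", 7L);
        flat.put("my_nested_row$$0", "hello");
        flat.put("my_nested_row$$1", 42L);
        System.out.println(nest(flat)); // prints {id=7, my_nested_row=[hello, 42]}
    }
}
```

In a real pipeline the same grouping would live in a transform applied to the SqlTransform output and would build a Row against a schema that declares the nested field. That part is not shown here, since the positional `$$0`, `$$1` names lose the original field names (`foo`, `bar`) and you would have to supply them yourself.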

Answered 2021-09-23T23:59:51.360