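Has anyone published a working, tested TPC-DS or TPC-H benchmark in 2020 for an Azure Interactive Query HDI 4.0 cluster running Hadoop 3.x+?
I am using https://github.com/hortonworks/hive-testbench, but I get errors when trying to generate data for both TPC-H and TPC-DS.
The cluster is Interactive Query HDI 4.0 (Hadoop 3.1.1). What could this error be? The step that fails is the one where it runs the jar file.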
Generating data at scale factor 100.
Exception in thread "main" java.lang.IllegalAccessError:
class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
...
ls: `/tmp/tpch-generate/100/lineitem': No such file or directory
Data generation failed, exiting.
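One thing that may help narrow this down: an IllegalAccessError like the one above can occur when classes built against different library versions end up on the same classpath. That is only an assumption here, not a confirmed diagnosis. The sketch below is a hypothetical helper (the name `jars_containing` and the directory argument are illustrative, not part of hive-testbench) that lists which jar files under a directory contain the two classes named in the stack trace, so any duplicates or unexpected copies can be spotted.

```python
import sys
import zipfile
from pathlib import Path


def jars_containing(root, class_name):
    """Return paths of .jar files under `root` that contain `class_name`.

    `class_name` is a fully qualified Java class name; inner classes keep
    their `$` separator, as in the stack trace.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives instead of aborting the scan.
            continue
    return hits


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory (an assumption).
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    classes = (
        "org.apache.hadoop.hdfs.web.HftpFileSystem",
        "org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator",
    )
    for cls in classes:
        for jar in jars_containing(root, cls):
            print(cls, "->", jar)
```

Running it once against the hive-testbench checkout and once against the cluster's Hadoop library directory would show whether the two classes come from different jars.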
My second question: for TPC-DS, whenever I run a "large" scale factor, the optimize step fails. It usually fails on table 17 or 18. Any ideas about what this could be?
INFO : Loading data to table tpcds_bin_partitioned_orc_100.store_sales partition (ss_sold_date_sk=null) from wasb://asdasd-2020-04-16t02-32-03-034z@asdasd.blob.core.windows.net/hive/warehouse/managed/tpcds_bin_partitioned_orc_100.db/store_sales/.hive-staging_hive_2020-04-16_06-47-19_242_1371829803314907581-47/-ext-10000
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Exception updating metastore for acid table tpcds_bin_partitioned_orc_100.store_sales with partitions [store_sales
...
INFO : Completed executing command(queryId=hive_20200416064719_4aa11ffb-31c0-411f-a7ca-954c9741891d); Time taken: 1280.036 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Exception updating metastore for acid table tpcds_bin_partitioned_orc_100.store_sales with partitions