We have a small but critical Hadoop-HAWQ cluster. We created an external table on it that points to files in Hadoop.
Environment:
Product version: (HAWQ 1.3.0.2 build 14421) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2
What we tried:
When we try to read data from the external table with a query, e.g.
test=# select count(*) from EXT_TAB ;
we get the following error:
ERROR: data line too long. likely due to invalid csv data (seg0 slice1 SEG0.HOSTNAME.COM:40000 pid=447247)
DETAIL: External table trcd_stg0, line 12059 of pxf://hostname/tmp/def_rcd/?profile=HdfsTextSimple: "2012-08-06 00:00:00.0^2012-08-06 00:00:00.0^6552^2016-01-09 03:15:43.427^0005567^COMPLAINTS ..." :
Additional information:
The DDL of the external table is:
CREATE READABLE EXTERNAL TABLE sysprocompanyb.trcd_stg0
(
"DispDt" DATE,
"InvoiceDt" DATE,
"ID" INTEGER,
time timestamp without time zone,
"Customer" CHAR(7),
"CustomerName" CHARACTER VARYING(30),
"MasterAccount" CHAR(7),
"MasterAccName" CHAR(30),
"SalesOrder" CHAR(6),
"SalesOrderLine" NUMERIC(4, 0),
"OrderStatus" CHAR(200),
"MStockCode" CHAR(30),
"MStockDes" CHARACTER VARYING(500),
"MWarehouse" CHAR(200),
"MOrderQty" NUMERIC(10, 3),
"MShipQty" NUMERIC(10, 3),
"MBackOrderQty" NUMERIC(10, 3),
"MUnitCost" NUMERIC(15, 5),
"MPrice" NUMERIC(15, 5),
"MProductClass" CHAR(200),
"Salesperson" CHAR(200),
"CustomerPoNumber" CHAR(30),
"OrderDate" DATE,
"ReqShipDate" DATE,
"DispatchesMade" CHAR(1),
"NumDispatches" NUMERIC(4, 0),
"OrderValue" NUMERIC(26, 8),
"BOValue" NUMERIC(26, 8),
"OrdQtyInEaches" NUMERIC(21, 9),
"BOQtyInEaches" NUMERIC(21, 9),
"DispQty" NUMERIC(38, 3),
"DispQtyInEaches" NUMERIC(38, 9),
"CustomerClass" CHAR(200),
"MLineShipDate" DATE
)
LOCATION (
'pxf://HOSTNAME-HA/tmp/def_rcd/?profile=HdfsTextSimple'
)
FORMAT 'CSV' (delimiter '^' null '' escape '"' quote '"')
ENCODING 'UTF8';
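Since the FORMAT clause declares '"' as the quote character, one possible cause is a stray, unbalanced quote in a data line, which would make the CSV parser keep reading past the line break. The sketch below is a rough check for that, not a confirmed diagnosis. It assumes a local copy of the data file has been pulled from HDFS; the function names and the file path passed to scan_file are hypothetical.

```python
QUOTE = '"'  # quote character taken from the FORMAT clause


def find_unbalanced_quote_lines(lines, quote=QUOTE):
    """Return 1-based numbers of lines holding an odd number of quote characters."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        # A doubled quote ("") counts as two, so well-formed single-line
        # records always contain an even number of quote characters.
        if line.count(quote) % 2 == 1:
            suspects.append(number)
    return suspects


def scan_file(path, quote=QUOTE):
    """Scan a local copy of the data file (path is a placeholder)."""
    with open(path, encoding="utf-8", errors="replace", newline="") as handle:
        return find_unbalanced_quote_lines(handle, quote)
```

This is only a heuristic: a field that legitimately spans several lines inside quotes would also be reported. If line 12059 or a line shortly before it shows up in the result, the quote setting rather than the delimiter would be the first thing to look at.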
Any help would be greatly appreciated.