
I have a problem. I have a Spark RDD that I have to store inside an HBase table. We use the Apache Phoenix layer to talk to the database. There is a column of the table that is defined as an UNSIGNED_SMALLINT ARRAY:

CREATE TABLE EXAMPLE (..., Col10 UNSIGNED_SMALLINT ARRAY, ...);

As stated in the Phoenix documentation, which you can find here, the ARRAY data type is backed by java.sql.Array.

I'm using the phoenix-spark plugin to save the data of the RDD into the table. The problem is that I don't know how to create an instance of java.sql.Array, since I don't have any kind of Connection object. An extract of the code follows (the code is in Scala):

// Map RDD into an RDD of sequences or tuples
rdd.map {
  value =>
    (/* ... */
     value.getArray(),   // Array of Int to convert into a java.sql.Array
     /* ... */
    )
}.saveToPhoenix("EXAMPLE", Seq(/* ... */, "Col10", /* ... */), conf, zkUrl)

What is the correct way to proceed? Is there a way to do what I need?


1 Answer


The Phoenix folks have answered the question above by email. I'm reporting the answer here to leave the wisdom for whoever comes next.

To save arrays you can use plain old Scala array types. You can look at the test for an example: https://github.com/apache/phoenix/blob/master/phoenix-spark/src/it/scala/org/apache/phoenix/spark/PhoenixSparkIT.scala#L408-L427

Note that saving arrays is only supported as of Phoenix 4.5.0, although the patch is very small if you need to apply it yourself: https://issues.apache.org/jira/browse/PHOENIX-1968
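Based on the linked integration test, a minimal sketch of this approach might look like the following. The table name, column names, ZooKeeper quorum address, and sample values are placeholders, and running it of course requires a live Spark + Phoenix/HBase setup:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.phoenix.spark._   // adds saveToPhoenix to RDDs of tuples

// Hypothetical schema:
//   CREATE TABLE EXAMPLE (ID BIGINT NOT NULL PRIMARY KEY, COL10 UNSIGNED_SMALLINT ARRAY);
val sc = new SparkContext(new SparkConf().setAppName("phoenix-array-save"))

// A plain Scala Array in the tuple maps to the Phoenix ARRAY column --
// no java.sql.Array (and hence no Connection) is needed.
val dataSet = List(
  (1L, Array(1, 2, 3)),
  (2L, Array(4, 5))
)

sc.parallelize(dataSet)
  .saveToPhoenix(
    "EXAMPLE",
    Seq("ID", "COL10"),
    zkUrl = Some("localhost:2181")   // placeholder ZooKeeper quorum
  )
```

The key point is that the phoenix-spark plugin performs the conversion from the Scala array to the Phoenix ARRAY representation itself, so the RDD can simply carry native Scala collections.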

Nice answer. Thanks to the Phoenix guys.

Answered 2015-07-30T15:29:38.680