I am trying to insert data into Redis (Azure Cache for Redis) through Spark. There are roughly 700 million rows, and I am using the spark-redis connector to write them. The job runs for a while and then fails: some rows are inserted successfully, but after some time individual tasks start failing with the error below. I am running this from a Jupyter notebook.
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:205)
at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
at redis.clients.jedis.Protocol.process(Protocol.java:155)
at redis.clients.jedis.Protocol.read(Protocol.java:220)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:318)
at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:236)
at redis.clients.jedis.BinaryJedis.auth(BinaryJedis.java:2259)
at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:119)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:819)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:429)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:360)
at redis.clients.jedis.util.Pool.getResource(Pool.java:50)
... 27 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.net.SocketInputStream.read(SocketInputStream.java:127)
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
... 38 more
This is how I am writing the data:
df.write
  .option("host", REDIS_URL)
  .option("port", 6379)
  .option("auth", <PWD>)
  .option("timeout", 20000)
  .format("org.apache.spark.sql.redis")
  .option("table", "testrediskeys")
  .option("key.column", "dummy")
  .mode("overwrite")
  .save()
Spark : 3.0
Scala : 2.12
spark-redis: com.redislabs:spark-redis_2.12:2.6.0
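For reference, a self-contained sketch of the write path is below (Scala, spark-redis 2.6.0). The host, password, source path, and partition count are placeholders, not the real values from the failing job; the session-level spark.redis.* settings are the documented alternative to passing the same values as per-write options.

// Minimal sketch of the bulk write, assuming placeholder connection values.
import org.apache.spark.sql.{SaveMode, SparkSession}

val REDIS_URL = "<my-cache>.redis.cache.windows.net"  // placeholder host

val spark = SparkSession.builder()
  .appName("redis-bulk-load")
  // spark-redis connection settings can also be set once on the session
  // instead of repeating them as per-write options.
  .config("spark.redis.host", REDIS_URL)
  .config("spark.redis.port", "6379")
  .config("spark.redis.auth", "<PWD>")        // placeholder password
  .config("spark.redis.timeout", "20000")     // connection/read timeout in ms
  .getOrCreate()

val df = spark.read.parquet("<source path>")  // placeholder source, ~700M rows

// Fewer partitions means fewer concurrent Jedis connections against the cache;
// 32 is an arbitrary illustrative value.
df.repartition(32)
  .write
  .format("org.apache.spark.sql.redis")
  .option("table", "testrediskeys")
  .option("key.column", "dummy")
  .mode(SaveMode.Overwrite)
  .save()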