
I pulled the Docker image and ran the commands below to start it.

  1. docker run -it bitnami/spark:latest /bin/bash

  2. spark-shell --packages="org.elasticsearch:elasticsearch-spark-20_2.11:7.5.0"

I get the following output:

Ivy Default Cache set to: /opt/bitnami/spark/.ivy2/cache
The jars for the packages stored in: /opt/bitnami/spark/.ivy2/jars
:: loading settings :: url = jar:file:/opt/bitnami/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.elasticsearch#elasticsearch-spark-20_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-c785f3e6-7c78-469f-ab46-451f8be61a4c;1.0
        confs: [default]
Exception in thread "main" java.io.FileNotFoundException: /opt/bitnami/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-c785f3e6-7c78-469f-ab46-451f8be61a4c-1.0.xml (No such file or directory)
        at java.io.FileOutputStream.open0(Native Method)
        at java.io.FileOutputStream.open(FileOutputStream.java:270)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
        at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:70)
        at org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter.write(XmlModuleDescriptorWriter.java:62)
        at org.apache.ivy.core.module.descriptor.DefaultModuleDescriptor.toIvyFile(DefaultModuleDescriptor.java:563)
        at org.apache.ivy.core.cache.DefaultResolutionCacheManager.saveResolvedModuleDescriptor(DefaultResolutionCacheManager.java:176)
        at org.apache.ivy.core.resolve.ResolveEngine.resolve(ResolveEngine.java:245)
        at org.apache.ivy.Ivy.resolve(Ivy.java:523)
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1300)
        at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:304)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:774)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I tried other packages as well, but they all fail with the same error message.

Can you give me some suggestions on how to avoid this error?


2 Answers


Found the solution in https://github.com/bitnami/bitnami-docker-spark/issues/7. What we have to do is create a volume on the host that is mapped to a path inside the Docker container:

volumes:
  - ./jars_dir:/opt/bitnami/spark/ivy:z

and use that path as the Ivy cache path, like this:

spark-shell --conf spark.jars.ivy=/opt/bitnami/spark/ivy --conf spark.cassandra.connection.host=127.0.0.1 --packages com.datastax.spark:spark-cassandra-connector_2.12:3.0.0-beta --conf spark.sql.extensions=com.datastax.spark.connector.CassandraSparkExtensions

This all happens because /opt/bitnami/spark is not writable, so we have to mount a volume to work around it.
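The command above uses the Cassandra connector, but the same approach should work for the elasticsearch-spark package from the question. A minimal sketch, assuming a host directory named jars_dir (the name is only an example; any writable mount target works):

# On the host: start the container with a writable volume for the Ivy cache.
docker run -it -v "$PWD/jars_dir:/opt/bitnami/spark/ivy" bitnami/spark:latest /bin/bash

# Inside the container: point Ivy at the writable path and retry the original package.
spark-shell --conf spark.jars.ivy=/opt/bitnami/spark/ivy --packages="org.elasticsearch:elasticsearch-spark-20_2.11:7.5.0"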

answered 2020-07-16T08:39:16.643

The error "java.io.FileNotFoundException: /opt/bitnami/spark/.ivy2/" occurs because the location /opt/bitnami/spark/ is not writable. To solve this, modify the master spark service as follows: run it as the root user and add a mounted volume path for the required jars.

Here is a working block for the spark service written with Docker Compose:

spark:
  image: docker.io/bitnami/spark:3
  container_name: spark
  environment:
    - SPARK_MODE=master
    - SPARK_RPC_AUTHENTICATION_ENABLED=no
    - SPARK_RPC_ENCRYPTION_ENABLED=no
    - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
    - SPARK_SSL_ENABLED=no
  user: root
  ports:
    - '8880:8080'
  volumes:
    - ./spark-defaults.conf:/opt/bitnami/spark/conf/spark-defaults.conf
    - ./jars_dir:/opt/bitnami/spark/ivy:z
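The compose block also mounts a ./spark-defaults.conf, whose contents the answer does not show. A minimal sketch of what it could contain, assuming it only points Ivy at the mounted volume so that --packages resolution works without extra --conf flags:

# ./spark-defaults.conf (contents assumed; the answer only shows the mount)
spark.jars.ivy    /opt/bitnami/spark/ivy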
answered 2021-08-30T07:50:35.847