We are running a Spark job on EMR using emr-5.26.0, and it works fine on the EMR cluster. Now we have decided to move to EMR on EKS with a Fargate profile. The currently working command:
spark-submit --deploy-mode cluster --driver-memory 10g --executor-memory 10g --files "s3://stackoverflow1-analytics/nba/omnichannel/module/Pickle_3D_SD.sav,s3://stackoverflow1-analytics/nba/omnichannel/module/to_pickle_3D.py,s3://stackoverflow1-analytics/nba/omnichannel/config/rule_nba1.json" --conf "spark.yarn.dist.archives=s3://stackoverflow1-analytics/nba/omnichannel/module/libs.zip,s3://stackoverflow1-analytics/nba/omnichannel/module/app_bp_keras1.zip" --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=python3.6 --py-files "app_bp_keras1.zip,libs.zip" s3://stackoverflow1-analytics/nba/omnichannel/module/wrapper3D_legacy.py --partitions 1000 --rule-file "rule_nba1.json"
Since we are now migrating to EMR on EKS, I changed the job as follows to run it there:
aws emr-containers start-job-run
--virtual-cluster-id ofw8xux19xom1s3tvfyy6y9jr
--name nbaTest
--execution-role-arn arn:aws:iam::307142429795:role/aws-a9006-glbl-00-d-rol-verso-shr-eks01
--release-label emr-6.2.0-latest
--job-driver '{"sparkSubmitJobDriver": {"entryPoint": "s3://stackoverflow1-analytics/nba/omnichannel/module/wrapper3D_legacy.py","sparkSubmitParameters": "--py-files s3://stackoverflow1-analytics/nba/omnichannel/module/libs.zip,s3://stackoverflow1-analytics/nba/omnichannel/module/app_bp_keras1.zip --conf spark.hadoop.fs.s3a.access.key=XXXXXXXXXXXXXXX --conf spark.hadoop.fs.s3a.secret.key=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX --conf spark.executor.instances=2 --conf spark.executor.memory=10G --conf spark.executor.cores=2 --conf spark.driver.cores=2 --conf spark.submit.deployMode=cluster"}}'
--configuration-overrides '{"applicationConfiguration":[{"classification": "spark-defaults","properties": {"spark.sql.shuffle.partitions":"1000","spark.files":"s3://stackoverflow1-analytics/nba/omnichannel/module/Pickle_3D_SD.sav,s3://stackoverflow1-analytics/nba/omnichannel/module/to_pickle_3D.py,s3://stackoverflow1-analytics/nba/omnichannel/config/rule_nba1.json","spark.driver.memory": "10G","spark.dynamicAllocation.enabled":"true","spark.dynamicAllocation.shuffleTracking.enabled":"true","spark.dynamicAllocation.minExecutors":"2","spark.dynamicAllocation.maxExecutors":"100","spark.dynamicAllocation.initialExecutors":"5"}}], "monitoringConfiguration": {"cloudWatchMonitoringConfiguration": {"logGroupName": "EMROnEKS","logStreamNamePrefix": "nba_"}, "s3MonitoringConfiguration": {"logUri": "s3://stackoverflow1-analytics/emr-eks-logs/emr-eks-logs/gdt/"}}}'
I am running the above job with emr-6.2.0-latest because, as of Amazon EMR versions 5.32.0 and 6.2.0, you can deploy Amazon EMR on EKS. But my job is failing with the error below:
Traceback (most recent call last):
File "/tmp/spark-6b1640d4-9d39-4397-a737-9c0e5d584ab0/libs.zip/numpy/core/__init__.py", line 40, in <module>
File "/tmp/spark-6b1640d4-9d39-4397-a737-9c0e5d584ab0/libs.zip/numpy/core/multiarray.py", line 12, in <module>
File "/tmp/spark-6b1640d4-9d39-4397-a737-9c0e5d584ab0/libs.zip/numpy/core/overrides.py", line 6, in <module>
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError:
Original error was: No module named 'numpy.core._multiarray_umath'
I spoke with the AWS support team. They suspect there may be some Python dependency issue in the libraries related to the version change. Any help resolving this would be greatly appreciated.
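For context, one thing I noticed (this is my own assumption, not something AWS confirmed): the working YARN job also shipped the zips via spark.yarn.dist.archives, which extracts them on each node, while the EKS job relies on --py-files alone, which leaves libs.zip zipped on sys.path. numpy ships compiled extension modules such as numpy.core._multiarray_umath, and Python's zipimport can load pure-Python code from a zip but never .so files. A minimal local sketch of that behavior (the module names here are made up for the demo):

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip with one pure-Python module and one fake
# compiled-extension entry (the .so name is invented for this demo).
tmp = tempfile.mkdtemp()
zpath = os.path.join(tmp, "libs_demo.zip")
with zipfile.ZipFile(zpath, "w") as z:
    z.writestr("pure_mod.py", "VALUE = 42\n")
    z.writestr("ext_mod.cpython-37m-x86_64-linux-gnu.so", b"\x7fELF")

sys.path.insert(0, zpath)

import pure_mod                  # pure-Python: zipimport handles this fine
print(pure_mod.VALUE)            # → 42

try:
    import ext_mod               # compiled extension: zipimport cannot load .so files
except ImportError as exc:
    print("ImportError:", exc)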