
We are running a Spark job on EMR using emr-5.26.0, and the job runs fine on the EMR cluster. Now we have decided to use EMR on EKS with a Fargate profile. This is what currently works:

    spark-submit --deploy-mode cluster --driver-memory 10g --executor-memory 10g \
      --files "s3://stackoverflow1-analytics/nba/omnichannel/module/Pickle_3D_SD.sav,s3://stackoverflow1-analytics/nba/omnichannel/module/to_pickle_3D.py,s3://stackoverflow1-analytics/nba/omnichannel/config/rule_nba1.json" \
      --conf "spark.yarn.dist.archives=s3://stackoverflow1-analytics/nba/omnichannel/module/libs.zip,s3://stackoverflow1-analytics/nba/omnichannel/module/app_bp_keras1.zip" \
      --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=python3.6 \
      --py-files "app_bp_keras1.zip,libs.zip" \
      s3://stackoverflow1-analytics/nba/omnichannel/module/wrapper3D_legacy.py --partitions 1000 --rule-file "rule_nba1.json"

Since we are now migrating to EMR on EKS, I made the following changes so the job runs on EMR on EKS:

    aws emr-containers start-job-run \
      --virtual-cluster-id ofw8xux19xom1s3tvfyy6y9jr \
      --name nbaTest \
      --execution-role-arn arn:aws:iam::307142429795:role/aws-a9006-glbl-00-d-rol-verso-shr-eks01 \
      --release-label emr-6.2.0-latest \
      --job-driver '{"sparkSubmitJobDriver": {"entryPoint": "s3://stackoverflow1-analytics/nba/omnichannel/module/wrapper3D_legacy.py","sparkSubmitParameters": "--py-files s3://stackoverflow1-analytics/nba/omnichannel/module/libs.zip,s3://stackoverflow1-analytics/nba/omnichannel/module/app_bp_keras1.zip --conf spark.hadoop.fs.s3a.access.key=XXXXXXXXXXXXXXX --conf spark.hadoop.fs.s3a.secret.key=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX --conf spark.executor.instances=2 --conf spark.executor.memory=10G --conf spark.executor.cores=2 --conf spark.driver.cores=2 --conf spark.submit.deployMode=cluster"}}' \
      --configuration-overrides '{"applicationConfiguration":[{"classification": "spark-defaults","properties": {"spark.sql.shuffle.partitions":"1000","spark.files":"s3://stackoverflow1-analytics/nba/omnichannel/module/Pickle_3D_SD.sav,s3://stackoverflow1-analytics/nba/omnichannel/module/to_pickle_3D.py,s3://stackoverflow1-analytics/nba/omnichannel/config/rule_nba1.json","spark.driver.memory": "10G","spark.dynamicAllocation.enabled":"true","spark.dynamicAllocation.shuffleTracking.enabled":"true","spark.dynamicAllocation.minExecutors":"2","spark.dynamicAllocation.maxExecutors":"100","spark.dynamicAllocation.initialExecutors":"5"}}], "monitoringConfiguration": {"cloudWatchMonitoringConfiguration": {"logGroupName": "EMROnEKS","logStreamNamePrefix": "nba_"}, "s3MonitoringConfiguration": {"logUri": "s3://stackoverflow1-analytics/emr-eks-logs/emr-eks-logs/gdt/"}}}'
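Note that the old cluster pins `spark.yarn.appMasterEnv.PYSPARK_PYTHON=python3.6`, while the EMR 6.2.0 image ships a newer Python. As a first diagnostic (this script is my suggestion, not part of the original job), one could temporarily submit a trivial script as the `entryPoint` to see exactly which interpreter and numpy the EMR-on-EKS image provides:

```python
# Hypothetical diagnostic entryPoint: print the interpreter version and,
# if numpy is importable, where it was loaded from. Comparing this output
# with the Python the zip was built for (3.6) exposes an ABI mismatch.
import sys

print("python:", sys.version.split()[0])
try:
    import numpy
    print("numpy:", numpy.__version__, "from", numpy.__file__)
except ImportError as exc:
    print("numpy import failed:", exc)
```

If this prints a 3.7.x interpreter while `libs.zip` was assembled under Python 3.6, the bundled compiled extensions cannot load on the new runtime.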
                    
I am running the above job with the emr-6.2.0-latest release label, since EMR on EKS is supported starting with Amazon EMR versions 5.32.0 and 6.2.0. But the job fails with the error below:
                    
    Traceback (most recent call last):
      File "/tmp/spark-6b1640d4-9d39-4397-a737-9c0e5d584ab0/libs.zip/numpy/core/__init__.py", line 40, in <module>
      File "/tmp/spark-6b1640d4-9d39-4397-a737-9c0e5d584ab0/libs.zip/numpy/core/multiarray.py", line 12, in <module>
      File "/tmp/spark-6b1640d4-9d39-4397-a737-9c0e5d584ab0/libs.zip/numpy/core/overrides.py", line 6, in <module>
    ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

    ImportError:

    Original error was: No module named 'numpy.core._multiarray_umath'
            

I spoke with the AWS support team. They suspect there may be a Python dependency problem in the bundled libraries related to the version change. Any help resolving this issue would be appreciated.
