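Hello, dear members! I want to train a model with BigDL. I have a medical image dataset in the form of pickle object files (.pck). Each pickle file is a 3D image (a 3D array).
I tried to convert it to a Spark DataFrame using the BigDL Python API: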
from pyspark.sql import SQLContext

# sc is the existing SparkContext
pickleRdd = sc.pickleFile("/home/student/BigDL-trainings/elephantscale/data/volumetric_data/329637-8.pck")
sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame(pickleRdd)
It throws this error:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor driver)
: java.io.IOException: file:/home/student/BigDL-trainings/elephantscale/data/volumetric_data/329637-8.pck not a SequenceFile
I ran this code on both Python 3.5 and Python 2.7, and I get the same error in both cases.
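For context on the "not a SequenceFile" message: `sc.pickleFile` loads an RDD that was previously written with `RDD.saveAsPickleFile`, which Spark stores as a SequenceFile. It does not read a file produced by Python's plain `pickle.dump`. Below is a minimal sketch of one possible workaround, assuming the .pck file is a plain pickle holding a single 3D array (a NumPy array or nested lists) that fits in driver memory. The helper names `load_volume`, `volume_to_rows`, and `rows_to_dataframe`, and the column names, are hypothetical and only for illustration.

```python
import pickle


def load_volume(path):
    """Load one plain pickle file on the driver (assumed not to be a Spark SequenceFile)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def volume_to_rows(volume):
    """Turn a 3D array-like into one (slice_index, height, width, pixels) tuple per slice."""
    # NumPy arrays expose tolist(); nested Python lists are used as-is.
    nested = volume.tolist() if hasattr(volume, "tolist") else volume
    rows = []
    for i, plane in enumerate(nested):
        # Flatten each 2D slice row by row into a list of floats.
        flat = [float(v) for row in plane for v in row]
        rows.append((i, len(plane), len(plane[0]), flat))
    return rows


def rows_to_dataframe(sql_context, rows):
    """Build a DataFrame from the rows; sql_context is an existing SQLContext."""
    # Hypothetical column names chosen for this sketch.
    return sql_context.createDataFrame(
        rows, ["slice_index", "height", "width", "pixels"]
    )
```

With the objects from the question, usage would look like `df = rows_to_dataframe(sqlContext, volume_to_rows(load_volume(path)))`, where `path` is the local .pck path. Storing one row per 2D slice is only one possible layout; whether it suits BigDL training depends on how the model expects its input.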