from pyspark.sql.window import Window
from pyspark.sql import functions as F
maxcol = F.udf(lambda row: F.max(row))
temp = [("ID1", '2019-01-01', '2019-02-01'),
        ("ID2", '2018-01-01', '2019-05-01'),
        ("ID3", '2019-06-01', '2019-04-01')]
t1 = spark.createDataFrame(temp, ["ID", "colA", "colB"])
maxDF = t1.withColumn("maxval", maxcol(F.struct([t1[x] for x in t1.columns[1:]])))
All I want is a new column containing the maximum date of colA and colB. When I run the code above and call maxDF.show(), I get the following error:
'NoneType' object has no attribute '_jvm'