I am trying to create a graph from some of the Google Web Graph data, which can be found here:
https://snap.stanford.edu/data/web-Google.html
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val textFile = sc.textFile("hdfs://n018-data.hursley.ibm.com/user/romeo/web-Google.txt")
val arrayForm = textFile.filter(_.charAt(0)!='#').map(_.split("\\s+")).cache()
val nodes = arrayForm.flatMap(array => array).distinct().map(_.toLong)
val edges = arrayForm.map(line => Edge(line(0).toLong,line(1).toLong))
val graph = Graph(nodes,edges)
Unfortunately, I get this error:
<console>:27: error: type mismatch;
found : org.apache.spark.rdd.RDD[Long]
required: org.apache.spark.rdd.RDD[(org.apache.spark.graphx.VertexId, ?)]
Error occurred in an application involving default arguments.
val graph = Graph(nodes,edges)
So how do I create a VertexId object? As I understand it, passing a Long should be sufficient.
Any ideas?
Thanks a lot!
Romeo