hadoop - Hadoop not running in a multi-node cluster

Question

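I have a jar file that I built myself, "Tsp.jar". The same jar runs fine on a single-node Hadoop cluster. However, when I run it on a cluster of two machines (a laptop and a desktop), it throws an exception when the map phase reaches 50%. Here is the output: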

`hadoop@psycho-O:/usr/local/hadoop$ bin/hadoop jar Tsp.jar clust-Tsp_ip1 clust_Tsp_op4
11/04/27 16:13:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/04/27 16:13:06 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/04/27 16:13:06 INFO mapred.FileInputFormat: Total input paths to process : 1
11/04/27 16:13:06 INFO mapred.JobClient: Running job: job_201104271608_0001
11/04/27 16:13:07 INFO mapred.JobClient:  map 0% reduce 0%
11/04/27 16:13:17 INFO mapred.JobClient:  map 50% reduce 0%
11/04/27 16:13:20 INFO mapred.JobClient: Task Id : attempt_201104271608_0001_m_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Tsp$TspReducer
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:841)
    at org.apache.hadoop.mapred.JobConf.getCombinerClass(JobConf.java:853)
    at org.apache.hadoop.mapred.Task$CombinerRunner.create(Task.java:1100)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:812)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Tsp$TspReducer
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:833)
    ... 6 more
Caused by: java.lang.ClassNotFoundException: Tsp$TspReducer
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    ... 7 more

11/04/27 16:13:20 WARN mapred.JobClient: Error reading task outputemil-desktop
11/04/27 16:13:20 WARN mapred.JobClient: Error reading task outputemil-desktop
^Z
[1]+  Stopped                 bin/hadoop jar Tsp.jar clust-Tsp_ip1 clust_Tsp_op4

hadoop@psycho-O:~$ jps
4937 Jps
3976 RunJar

` Also, the cluster works fine when running the wordcount example, so I guess the problem is with the Tsp.jar file.

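1) Is a jar file required in order to run on a cluster?

2) Here I am trying to run a jar file on the cluster I set up, but I still get a warning that no job jar file is set. Why is that?

3) What should I watch out for when running a jar file? What must it contain besides the program I wrote? My jar file contains Tsp.class, Tsp$TspReducer.class and Tsp$TspMapper.class. The terminal says it cannot find Tsp$TspReducer even though it is already in the jar file.

Thanks

Edit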

public class Tsp {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(Tsp.class);
        conf.setJobName("Tsp");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(TspMapper.class);
        conf.setCombinerClass(TspReducer.class);
        conf.setReducerClass(TspReducer.class);
        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }

    public static class TspMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        function findCost() {
        }

        public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            // find adjacency matrix from the input
            for(int i = 0; ...) {
                .....
                output.collect(new Text(string1), new Text(string2));
            }
        }
    }

    public static class TspReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {
        Text t1 = new Text();

        public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
            String a;
            a = values.next().toString();
            output.collect(key, new Text(a));
        }
    }
}

score 7 · Accepted Answer

You currently have

conf.setJobName("Tsp");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);
conf.setMapperClass(TspMapper.class);
conf.setCombinerClass(TspReducer.class);
conf.setReducerClass(TspReducer.class);

and the warning "No job jar file set" indicates that you have not set the job jar.

You need something like

conf.setJarByClass(Tsp.class);

As far as I can tell, that should fix the error shown here.
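If the warning persists, it may help to check where the JVM actually loaded the driver class from. The sketch below uses only the Java standard library and is not Hadoop's own code. It assumes that finding the job jar boils down to looking up the class file as a classpath resource and checking whether that resource lives inside a jar. JarLocator and its methods are hypothetical names made up for this illustration.

```java
import java.net.URL;

/** Hypothetical helper, not part of Hadoop: reports where a class file was loaded from. */
final class JarLocator {

    /** Turns a binary class name such as Tsp$TspReducer into its resource path. */
    static String resourceNameOf(Class<?> cls) {
        return cls.getName().replace('.', '/') + ".class";
    }

    /** Returns the URL of the class file, or null if the loader cannot find it. */
    static URL locate(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader();
        String resource = resourceNameOf(cls);
        // Classes loaded by the bootstrap loader report a null ClassLoader.
        return loader == null
                ? ClassLoader.getSystemResource(resource)
                : loader.getResource(resource);
    }

    /** True when the class file sits inside a jar (URL protocol "jar"). */
    static boolean isInJar(Class<?> cls) {
        URL url = locate(cls);
        return url != null && "jar".equals(url.getProtocol());
    }

    /** Stand-in for a nested class such as Tsp.TspReducer. */
    static class Nested { }

    public static void main(String[] args) {
        // Nested classes use '$' in their class file name.
        System.out.println(resourceNameOf(Nested.class));
        System.out.println("loaded from: " + locate(JarLocator.class));
        System.out.println("in a jar: " + isInJar(JarLocator.class));
    }
}
```

Under that assumption, a driver class loaded from a plain directory of .class files would report "in a jar: false", which is the kind of situation in which no job jar can be inferred from the class alone.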

score 2 · Accepted Answer

I had exactly the same problem. Here is how I solved it (assuming your MapReduce class is called A). After creating the job, call:
job.setJarByClass(A.class);

score 2 · Accepted Answer

11/04/27 16:13:06 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

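Do what it says: when setting up your job, set the jar that contains your classes. Hadoop copies the jar to the DistributedCache (a file system on each node) and uses the classes from it.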
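Since everything depends on the jar that holds the classes, a quick sanity check is to confirm that the jar has an entry for every class the job names, including nested classes, whose entries contain a '$'. This is a minimal standard-library sketch, not a Hadoop tool. JarCheck and its methods are hypothetical names, and the demo builds a throwaway jar with empty placeholder entries rather than real class files.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

/** Hypothetical helper: checks that a jar has a .class entry for each named class. */
final class JarCheck {

    /** Returns the binary class names from 'required' that have no .class entry in the jar. */
    static List<String> missingClasses(File jar, List<String> required) throws IOException {
        List<String> missing = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar)) {
            for (String name : required) {
                String entry = name.replace('.', '/') + ".class";
                if (jarFile.getEntry(entry) == null) {
                    missing.add(name);
                }
            }
        }
        return missing;
    }

    /** Demo only: writes a jar whose entries are empty placeholders, not real class files. */
    static File writeDemoJar(List<String> classNames) throws IOException {
        File jar = File.createTempFile("demo", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            for (String name : classNames) {
                out.putNextEntry(new JarEntry(name.replace('.', '/') + ".class"));
                out.closeEntry();
            }
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        // The demo jar deliberately leaves out the reducer class.
        File jar = writeDemoJar(Arrays.asList("Tsp", "Tsp$TspMapper"));
        List<String> required = Arrays.asList("Tsp", "Tsp$TspMapper", "Tsp$TspReducer");
        System.out.println(missingClasses(jar, required)); // prints [Tsp$TspReducer]
    }
}
```

In the question the jar already contained all three classes, so a check like this would only rule out a packaging problem; the remaining suspect is that the job was never told which jar to ship.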
