This is the error I am getting:
```
14/02/28 02:52:43 INFO mapred.JobClient: Task Id : attempt_201402271927_0020_m_000001_2, Status : FAILED
java.lang.NullPointerException
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:843)
    at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:376)
    at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:85)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:584)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:656)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
```
I have commented out my code; it basically takes the typical LongWritable and Text input, and then I just output a constant IntWritable 1 and an empty Weather object (a custom class):
Here is my mapper class:
```java
public class Map extends Mapper<LongWritable, Text, IntWritable, Weather> {

    private IntWritable id = new IntWritable(1);
    private Weather we = new Weather();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        //String s;
        //String line = value.toString();
        //int start[] = {0,18,31,42,53,64,74,84,88,103};
        //int end[] = {6,22,33,44,55,66,76,86,93,108};
        //if(line.length() > 108) {
            // create the object to hold our data
            // getStuff()
            // parse the string
            // push the object onto our data structure
            context.write(id, we);
        //}
    }
}
```
Here is my reducer:
```java
public class Reduce extends Reducer<IntWritable, Weather, IntWritable, Text> {

    private Text text = new Text("one");
    private IntWritable one = new IntWritable(1);

    public void reduce(IntWritable key, Iterable<Weather> weather, Context context)
            throws IOException, InterruptedException {
        //for(Weather w : weather) {
            // text.set(w.toString());
            context.write(one, text);
        //}
    }
}
```
Here is my main:
```java
public class Skyline {

    public static void main(String[] args) throws IOException {
        //String s = args[0].length() > 0 ? args[0] : "skyline.in";
        Path input, output;
        Configuration conf = new Configuration();
        conf.set("io.serializations",
                "org.apache.hadoop.io.serializer.JavaSerialization,"
                + "org.apache.hadoop.io.serializer.WritableSerialization");

        try {
            input = new Path(args[0]);
        } catch(ArrayIndexOutOfBoundsException e) {
            input = new Path("hdfs://localhost/user/cloudera/in/skyline.in");
        }

        try {
            output = new Path(args[1]);
            //FileSystem.getLocal(conf).delete(output, true);
        } catch(ArrayIndexOutOfBoundsException e) {
            output = new Path("hdfs://localhost/user/cloudera/out/");
            //FileSystem.getLocal(conf).delete(output, true);
        }

        Job job = new Job(conf, "skyline");
        job.setJarByClass(Skyline.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Weather.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, input);
        FileOutputFormat.setOutputPath(job, output);

        try {
            job.waitForCompletion(true);
        } catch(InterruptedException e) {
            System.out.println("Interrupted Exception");
        } catch(ClassNotFoundException e) {
            System.out.println("ClassNotFoundException");
        }
    }
}
```
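One thing I was unsure about in the driver: my map output types (IntWritable/Weather) differ from my reduce output types (IntWritable/Text), and as I understand it the Job then needs the map-side classes declared separately rather than inferred from setOutputKeyClass/setOutputValueClass. This is a sketch of the configuration lines I mean (not something my code currently does; same Skyline/Weather classes as above assumed):

```java
// Declare the map output types separately when they differ
// from the final (reducer) output types.
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(Weather.class);   // what Map.map() writes
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(Text.class);         // what Reduce.reduce() writes
```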
Here is an example of my Weather class:
```java
public class Weather {

    private int stationId;

    public Weather() {}

    public int getStation() { return this.stationId; }

    public void setStation(int r) { this.stationId = r; }

    //...24 additional fields of ints, doubles and Strings
}
```
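While debugging I also started wondering whether Weather needs to be serializable for the shuffle at all; my understanding is that map output values normally implement org.apache.hadoop.io.Writable with a write/readFields pair. Below is a self-contained sketch of what I mean. Note the Writable interface here is a local stand-in with the same two methods as Hadoop's, only so the snippet compiles without the Hadoop jars; the real class would implement org.apache.hadoop.io.Writable instead:

```java
import java.io.*;

// Local stand-in for org.apache.hadoop.io.Writable (same two methods),
// so this sketch compiles without the Hadoop jars.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Sketch: Weather serializing its fields one by one.
class Weather implements Writable {
    private int stationId;
    // ...the other ~24 int/double/String fields would be handled the same way

    public Weather() {}                      // Hadoop needs a no-arg constructor

    public int getStation() { return stationId; }
    public void setStation(int r) { stationId = r; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(stationId);             // write each field in a fixed order
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        stationId = in.readInt();            // read back in the same order
    }
}
```

A round trip through a byte stream restores the field values, which is all the shuffle needs from the class.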
I am at my wit's end. At this point I have a shell of a program that does nothing and I still get the error. I have read up on Java generics to make sure I am using them correctly (I think I am), and I am very green with the MapReduce paradigm, but this program is just a shell, modified from the MapReduce tutorial (https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Walk-through).