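Here is another introductory Hadoop demo: computing an average. It is a small modification of the WordCount demo. The input is a set of score sheets (name, score), and the goal is the average score of each person. For example:
// subject1.txt
a 90
b 80
c 70
// subject2.txt
a 100
b 90
c 80
We want the average score of a, b, and c. The approach is simple: in the map phase the key is the name and the value is the score, and each pair is written straight to the output. In the reduce phase, each call receives a name as the key and the list of scores for that name as the values, so it only has to average them.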
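The idea above can be sketched in plain Java without a cluster. This is only an illustration, not Hadoop code: the class name AverageSketch and its methods map, reduce, and run are hypothetical, an in-memory TreeMap stands in for the shuffle that groups values by key, and lines are assumed to be "name score" separated by whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AverageSketch {
    // "Map" step: parse one "name score" line and add the score to that name's group.
    static void map(String line, Map<String, List<Integer>> groups) {
        String[] parts = line.trim().split("\\s+");
        groups.computeIfAbsent(parts[0], k -> new ArrayList<>())
              .add(Integer.parseInt(parts[1]));
    }

    // "Reduce" step: average all scores collected for one name (integer division).
    static int reduce(List<Integer> scores) {
        int sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return sum / scores.size();
    }

    // Runs the map step over every line, then the reduce step once per name.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String line : lines) {
            map(line, groups);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : groups.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "a 90", "b 80", "c 70",    // subject1.txt
            "a 100", "b 90", "c 80");  // subject2.txt
        System.out.println(run(lines)); // prints {a=95, b=85, c=75}
    }
}
```

In the real job, Hadoop does the grouping between the two phases, so the mapper and reducer only contain the parsing and averaging logic.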
Two versions of the code are given here, one using TextInputFormat and one using KeyValueTextInputFormat as the input format.

TextInputFormat version:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class AveScore {

    public static class AveMapper extends Mapper<Object, Text, Text, IntWritable> {
        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // With TextInputFormat the value is the whole line, e.g. "a 90".
            // The split pattern was lost in the source; whitespace is assumed here.
            String line = value.toString();
            String[] strs = line.trim().split("\\s+");
            String name = strs[0];
            int score = Integer.parseInt(strs[1]);
            context.write(new Text(name), new IntWritable(score));
        }
    }

    public static class AveReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum and count every score emitted for this name.
            int sum = 0;
            int count = 0;
            for (IntWritable val : values) {
                sum += val.get();
                count++;
            }
            // Integer division: any fractional part of the average is dropped.
            int aveScore = sum / count;
            context.write(key, new IntWritable(aveScore));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // new Job(conf, name) is deprecated on Hadoop 2.x and later,
        // where Job.getInstance(conf, name) is the replacement.
        Job job = new Job(conf, "AverageScore");
        job.setJarByClass(AveScore.class);
        job.setMapperClass(AveMapper.class);
        job.setReducerClass(AveReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

KeyValueTextInputFormat version:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class AveScore_KeyValue {

    public static class AveMapper extends Mapper<Text, Text, Text, IntWritable> {
        @Override
        public void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            // The input format has already split the line: key = name, value = score.
            int score = Integer.parseInt(value.toString().trim());
            context.write(key, new IntWritable(score));
        }
    }

    public static class AveReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum and count every score emitted for this name.
            int sum = 0;
            int count = 0;
            for (IntWritable val : values) {
                sum += val.get();
                count++;
            }
            // Integer division: any fractional part of the average is dropped.
            int aveScore = sum / count;
            context.write(key, new IntWritable(aveScore));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The separator literal was lost in the source; a single space is assumed
        // here to match the sample input. The default separator is a tab.
        conf.set("mapreduce.input.keyvaluelinerecordreader.key.value.separator", " ");
        // new Job(conf, name) is deprecated on Hadoop 2.x and later,
        // where Job.getInstance(conf, name) is the replacement.
        Job job = new Job(conf, "AverageScore");
        job.setJarByClass(AveScore_KeyValue.class);
        job.setMapperClass(AveMapper.class);
        job.setReducerClass(AveReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

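For reference, KeyValueTextInputFormat hands the mapper the text before the first separator as the key and the rest of the line as the value; a line with no separator becomes the key with an empty value. The plain-Java sketch below mimics that rule for illustration only. KeyValueSplitSketch and split are hypothetical names, not part of Hadoop, and a single space is assumed as the separator.

```java
class KeyValueSplitSketch {
    // Returns {key, value}, splitting at the first occurrence of the separator.
    static String[] split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            // No separator: the whole line is the key, the value is empty.
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        String[] kv = split("a 90", ' ');
        System.out.println(kv[0] + " -> " + kv[1]); // prints "a -> 90"
    }
}
```

Because the split happens before the mapper runs, the mapper in this version only has to parse the value as an integer.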
The output is:
a	95
b	85
c	75

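Both reducers use integer division, so any fractional part of an average is dropped. The sample totals happen to divide evenly, which is why it does not show in the output. A small sketch of the difference follows; AverageDivisionSketch and its methods are hypothetical names. Emitting a fractional average from the job would also mean changing the output value type, for example to DoubleWritable, which the original code does not do.

```java
class AverageDivisionSketch {
    // Same arithmetic as the reducers above: the result is truncated toward zero.
    static int intAverage(int sum, int count) {
        return sum / count;
    }

    // Casting one operand to double keeps the fractional part.
    static double exactAverage(int sum, int count) {
        return (double) sum / count;
    }

    public static void main(String[] args) {
        int sum = 90 + 95; // two scores for one person
        int count = 2;
        System.out.println(intAverage(sum, count));   // prints 92
        System.out.println(exactAverage(sum, count)); // prints 92.5
    }
}
```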
Author: qiul12345, posted 2013-8-23 21:51:03
Original post: HadoopHelloWordExamples-求平均数 (computing the average). Thanks to the original author for sharing.