当前位置:  首页>> 技术小册>> Hadoop入门教程

主要是读取InputSplit的每一个Key,Value对并进行处理

  1. public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  2. /**
  3. * 预处理,仅在map task启动时运行一次
  4. */
  5. protected void setup(Context context) throws IOException, InterruptedException {
  6. }
  7. /**
  8. * 对于InputSplit中的每一对<key, value>都会运行一次
  9. */
  10. @SuppressWarnings("unchecked")
  11. protected void map(KEYIN key, VALUEIN value, Context context) throws IOException, InterruptedException {
  12. context.write((KEYOUT) key, (VALUEOUT) value);
  13. }
  14. /**
  15. * 扫尾工作,比如关闭流等
  16. */
  17. protected void cleanup(Context context) throws IOException, InterruptedException {
  18. }
  19. /**
  20. * map task的驱动器
  21. */
  22. public void run(Context context) throws IOException, InterruptedException {
  23. setup(context);
  24. while (context.nextKeyValue()) {
  25. map(context.getCurrentKey(), context.getCurrentValue(), context);
  26. }
  27. cleanup(context);
  28. }
  29. }
  30. public class MapContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> extends TaskInputOutputContext<KEYIN, VALUEIN, KEYOUT, VALUEOUT> {
  31. private RecordReader<KEYIN, VALUEIN> reader;
  32. private InputSplit split;
  33. /**
  34. * Get the input split for this map.
  35. */
  36. public InputSplit getInputSplit() {
  37. return split;
  38. }
  39. @Override
  40. public KEYIN getCurrentKey() throws IOException, InterruptedException {
  41. return reader.getCurrentKey();
  42. }
  43. @Override
  44. public VALUEIN getCurrentValue() throws IOException, InterruptedException {
  45. return reader.getCurrentValue();
  46. }
  47. @Override
  48. public boolean nextKeyValue() throws IOException, InterruptedException {
  49. return reader.nextKeyValue();
  50. }
  51. }

该分类下的相关小册推荐:

暂无相关推荐.