Hadoop

Enhance Hadoop MapReduce Speed for small jobs

Introduction

In some circumstances the input to a Hadoop MapReduce job is relatively small. The overhead of allocating containers and launching tasks in them then outweighs the gain from running the tasks in parallel, compared with running them sequentially on one node. In that case Hadoop can run the tasks sequentially in the same JVM as the application master. Such a job is said to be uberized, or run as an uber task.

Enable uber optimization

To enable uberization, set mapreduce.job.ubertask.enable to true. That setting alone is not sufficient, however.
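
As a minimal sketch, the property named above could be set cluster-wide in a configuration file. Placing it in mapred-site.xml is an assumption here, since the text does not say where the property is set. Only the property name and value come from the text.

```xml
<!-- Assumed location: mapred-site.xml (not stated in the text above). -->
<configuration>
  <property>
    <!-- Allow small jobs to run as uber tasks. -->
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
  </property>
</configuration>
```

As noted above, this flag on its own does not guarantee that a job is uberized.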