A Preemptive Fair Scheduler Policy for Disco MapReduce Framework
Disco is an open-source MapReduce framework and an alternative to Hadoop. Task preemption is an important feature that helps organizations relying on the MapReduce paradigm handle heterogeneous workloads, which usually consist of research applications (long-running, low priority) and production applications (short-running, high priority). The lack of preemption in Disco hurts production jobs when the two kinds of jobs must run in parallel: the high-priority response is delayed because no resources are available to compute it. In this paper we describe the implementation of the Preemptive Fair Scheduler Policy, which substantially reduced the execution time of production jobs in our experiments, with only a small impact on the research job.