Delay scheduling
University of California, Berkeley · Meta (United States) · +1 more institution
Abstract
As organizations start to use data-intensive cluster computing systems like Hadoop and Dryad for more applications, there is a growing need to share clusters between users. However, there is a conflict between fairness in scheduling and data locality (placing tasks on nodes that contain their input data). We illustrate this problem through our experience designing a fair scheduler for a 600-node Hadoop cluster at Facebook. To address the conflict between locality and fairness, we propose a simple algorithm called delay scheduling: when the job that should be scheduled next according to fairness cannot launch a local task, it waits for a small amount of time, letting other jobs launch tasks instead. We find…
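The waiting rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Job` class, the `assign_slot` function, the `max_delay` parameter, and the "fewest running tasks first" fairness order are all assumptions made for illustration. Each pending task is modeled only as the set of node names that hold its input data.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class Job:
    """Hypothetical job model: each pending task is the set of nodes holding its input."""
    name: str
    pending: List[Set[str]]
    running: int = 0
    wait_started: Optional[float] = None  # time the job first skipped a slot


def assign_slot(jobs: List[Job], node: str, now: float,
                max_delay: float) -> Optional[Tuple[str, bool]]:
    """Pick a task for a free slot on `node`.

    Returns (job name, launched_locally), or None if no job launches a task.
    """
    # Assumed fairness order: the job with the fewest running tasks goes first.
    for job in sorted(jobs, key=lambda j: j.running):
        if not job.pending:
            continue
        # Prefer a task whose input data lives on this node.
        local = next((t for t in job.pending if node in t), None)
        if local is not None:
            job.pending.remove(local)
            job.running += 1
            job.wait_started = None  # a local launch ends the waiting period
            return job.name, True
        # No local task: start the wait timer if it is not already running.
        if job.wait_started is None:
            job.wait_started = now
        # After waiting long enough, give up on locality for this slot.
        if now - job.wait_started >= max_delay:
            job.pending.pop(0)
            job.running += 1
            return job.name, False
        # Otherwise skip this job so that later jobs can use the slot.
    return None
```

With `max_delay` set to 0 the sketch reduces to strict fairness with no regard for locality, while a very large value makes jobs wait indefinitely for local slots, so the parameter trades a small amount of fairness for locality.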
Citation impact
- FWCI: 214.53
- Percentile: 100%
- References: 21
Authors
6
Topics & keywords
- Computer science
- Locality
- Scheduling (production processes)
- Distributed computing
- Dynamic priority scheduling
- Gang scheduling
- Fair-share scheduling
- Round-robin scheduling