Article · Oct 8, 2013 · Gold OA

Sparrow

University of California, Berkeley

Indexed in Crossref

Abstract

Large-scale data analytics frameworks are shifting towards shorter task durations and larger degrees of parallelism to provide low latency. Scheduling highly parallel jobs that complete in hundreds of milliseconds poses a major challenge for task schedulers, which will need to schedule millions of tasks per second on appropriate machines while offering millisecond-level latency and high availability. We demonstrate that a decentralized, randomized sampling approach provides near-optimal performance while avoiding the throughput and availability limitations of a centralized design. We implement and deploy our scheduler, Sparrow, on a 110-machine cluster and demonstrate that Sparrow performs within 12% of an…
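As an illustration of what "randomized sampling" can mean for a scheduler, here is a minimal sketch of the generic power-of-d-choices idea: probe a few randomly chosen workers and place the task on the one with the shortest queue. It is not taken from the paper and should not be read as Sparrow's implementation. The names `pick_worker` and `simulate`, the queue-length model, and all parameter values are hypothetical, and tasks never finish in this toy model.

```python
import random


def pick_worker(queue_lengths, probes=2, rng=random):
    """Probe `probes` distinct random workers; return the index of the least loaded.

    Hypothetical helper: `queue_lengths[i]` is the number of tasks queued on worker i.
    """
    candidates = rng.sample(range(len(queue_lengths)), probes)
    return min(candidates, key=lambda w: queue_lengths[w])


def simulate(num_workers=100, num_tasks=10_000, probes=2, seed=0):
    """Place tasks one at a time and return the final queue lengths.

    Simplifying assumption: tasks are only enqueued and never complete,
    so this measures placement balance only, not latency.
    """
    rng = random.Random(seed)
    queues = [0] * num_workers
    for _ in range(num_tasks):
        worker = pick_worker(queues, probes, rng)
        queues[worker] += 1
    return queues


if __name__ == "__main__":
    for d in (1, 2):
        queues = simulate(probes=d)
        print(f"probes={d}: longest queue = {max(queues)}")
```

With `probes=1` the placement is purely random; with `probes=2` each decision still touches only two workers and needs no central state, yet the longest queue is typically much closer to the average.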

Citation impact

Total citations: 581
FWCI: 127.01
Percentile: 100%
References: 27

Authors

4

Topics & keywords

Keywords
  • Computer science
  • Scheduling (production processes)
  • Latency (audio)
  • Distributed computing
  • Sparrow
  • Millisecond
  • Schedule
  • Parallel computing
UN Sustainable Development Goals
  • Industry, innovation and infrastructure

Funding