PipeDream
Microsoft Research (India) · Stanford University · +2 more institutions
Abstract
DNN training is extremely time-consuming, necessitating efficient multi-accelerator parallelization. Current approaches to parallelizing training primarily use intra-batch parallelization, where a single iteration of training is split over the available workers, but suffer from diminishing returns at higher worker counts. We present PipeDream, a system that adds inter-batch pipelining to intra-batch parallelism to further improve parallel training throughput, helping to better overlap computation with communication and reduce the amount of communication when possible. Unlike traditional pipelining, DNN training is bi-directional, where a forward pass through the computation graph is followed by a backward pass…
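As a rough illustration of why inter-batch pipelining helps, the toy model below compares two ways of running minibatches through a model split into stages. It is a sketch under stated assumptions, not PipeDream's scheduler: every forward or backward step is assumed to take one time unit, communication cost and weight versioning are ignored, and the greedy "run a backward step if one is ready, otherwise a forward step" rule and the function names are chosen here for illustration only.

```python
from collections import deque


def sequential_makespan(num_stages, num_minibatches):
    """Time units when only one minibatch is in flight at a time.

    Each minibatch does one forward and one backward step on every stage,
    and only one stage is busy at any moment.
    """
    return 2 * num_stages * num_minibatches


def pipelined_makespan(num_stages, num_minibatches):
    """Time units when several minibatches may be in flight at once.

    Toy assumption: each stage runs one unit-time step per tick and prefers
    a ready backward step over a ready forward step.
    """
    fwd = [deque() for _ in range(num_stages)]  # pending forward work per stage
    bwd = [deque() for _ in range(num_stages)]  # pending backward work per stage
    fwd[0].extend(range(num_minibatches))       # all inputs wait at stage 0

    done = 0
    ticks = 0
    while done < num_minibatches:
        # Pick work for every stage from the queues as they stand at tick start.
        running = []
        for s in range(num_stages):
            if bwd[s]:
                running.append((s, "B", bwd[s].popleft()))
            elif fwd[s]:
                running.append((s, "F", fwd[s].popleft()))

        # Hand results to the neighbouring stage once the tick is over.
        for s, kind, mb in running:
            if kind == "F":
                if s + 1 < num_stages:
                    fwd[s + 1].append(mb)   # forward pass continues downstream
                else:
                    bwd[s].append(mb)       # last stage turns around to backward
            else:
                if s > 0:
                    bwd[s - 1].append(mb)   # gradients flow back upstream
                else:
                    done += 1               # backward pass reached the first stage
        ticks += 1
    return ticks


if __name__ == "__main__":
    stages, minibatches = 4, 8
    print("sequential:", sequential_makespan(stages, minibatches))
    print("pipelined: ", pipelined_makespan(stages, minibatches))
```

With a single minibatch the two functions agree, since there is nothing to overlap; with several minibatches the pipelined count is lower because different stages work on different minibatches in the same tick.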
Citation impact
- FWCI: 40.04
- Percentile: 100%
- References: 40
Authors (8)
- Deepak Narayanan (corresponding author)
Microsoft Research (India), Stanford University
- Aaron Harlap
Microsoft Research (India), Carnegie Mellon University
- Amar Phanishayee
Microsoft Research (United Kingdom)
- Vivek Seshadri
Microsoft Research (United Kingdom)
- Nikhil R. Devanur
Microsoft Research (United Kingdom)
Topics & keywords
- Computer science
- Pipeline (software)
- Parallel computing
- Computation
- Throughput
- Parallelism (grammar)
- Algorithm