Article · Sep 6, 2015 · Closed access

Scalable distributed DNN training using commodity GPU cloud computing

Indexed in Crossref

Abstract

We introduce a new method for scaling up distributed Stochastic Gradient Descent (SGD) training of Deep Neural Networks (DNN). The method solves the well-known communication bottleneck problem that arises for data-parallel SGD because compute nodes frequently need to synchronize a replica of the model. We solve it by purposefully controlling the rate of weight-update per individual weight, which is in contrast to the uniform update-rate customarily imposed by the size of a mini-batch. It is shown empirically that the method can reduce the amount of communication by three orders of magnitude while training a typical DNN for acoustic modelling. This reduction in communication bandwidth enables efficient scaling…
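The abstract says the update rate is controlled per individual weight but, as truncated here, does not say how. One plausible way to illustrate the idea is a threshold scheme: each node accumulates gradient locally in a per-weight residual and transmits an update for a weight only when that residual crosses a threshold. The sketch below assumes this scheme; the function name `sparse_updates`, the `threshold` parameter, and the fixed-magnitude message format are illustrative choices, not details taken from the text above.

```python
def sparse_updates(gradient, residual, threshold):
    """Accumulate `gradient` into `residual` (in place) and return the
    sparse list of (index, value) updates that should be transmitted.

    Assumed scheme (hypothetical, for illustration): a weight is sent only
    when its accumulated residual reaches +/- threshold, and the amount
    sent is subtracted from the residual so no gradient is discarded.
    """
    messages = []
    for i, g in enumerate(gradient):
        residual[i] += g
        if residual[i] >= threshold:
            messages.append((i, threshold))
            residual[i] -= threshold
        elif residual[i] <= -threshold:
            messages.append((i, -threshold))
            residual[i] += threshold
    return messages


if __name__ == "__main__":
    residual = [0.0, 0.0, 0.0, 0.0]
    # Two mini-batch gradients for a toy four-weight model.
    for grad in ([0.5, -0.25, 1.5, 0.0], [0.5, -0.25, 0.0, 0.0]):
        sent = sparse_updates(grad, residual, threshold=1.0)
        print(f"sent {len(sent)} of {len(grad)} weights: {sent}")
```

Under this assumption, weights with small gradients are communicated rarely and weights with large gradients often, so the amount of traffic depends on the threshold rather than on the mini-batch size.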

Citation impact

Total citations: 547
FWCI: 28.50
Percentile: 100%
References: 23

Authors

1

Topics & keywords

Keywords
  • Computer science
  • Cloud computing
  • Scalability
  • Commodity
  • Training (meteorology)
  • Distributed computing
  • Parallel computing
  • Operating system
UN Sustainable Development Goals
  • Industry, innovation and infrastructure