Scalable distributed DNN training using commodity GPU cloud computing

Ström, Nikko

doi:10.21437/interspeech.2015-354

Article · Sep 6, 2015 · Closed access

Indexed in Crossref

Abstract

We introduce a new method for scaling up distributed Stochastic Gradient Descent (SGD) training of Deep Neural Networks (DNN). The method solves the well-known communication bottleneck problem that arises for data-parallel SGD because compute nodes frequently need to synchronize a replica of the model. We solve it by purposefully controlling the rate of weight-update per individual weight, which is in contrast to the uniform update-rate customarily imposed by the size of a mini-batch. It is shown empirically that the method can reduce the amount of communication by three orders of magnitude while training a typical DNN for acoustic modelling. This reduction in communication bandwidth enables efficient scaling…
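The abstract does not spell out how the per-weight update rate is controlled. As a rough illustration only, the sketch below assumes a threshold-with-residual scheme: each weight accumulates its gradient locally and communicates an update only when the accumulated value reaches a threshold, so rarely changing weights generate almost no traffic. The function names, the threshold value, and the learning rate are all hypothetical and are not taken from this page.

```python
# Illustrative sketch, NOT the paper's implementation.
# Assumption: a per-weight update is sent only when the locally accumulated
# gradient reaches a fixed threshold; the remainder stays in a residual.

def sparsify_update(gradient, residual, threshold):
    """Add the gradient to the residual and emit sparse messages.

    Returns (messages, new_residual). Each message is an (index, value)
    pair whose value is +threshold or -threshold.
    """
    messages = []
    new_residual = []
    for i, (g, r) in enumerate(zip(gradient, residual)):
        acc = r + g
        if acc >= threshold:
            messages.append((i, threshold))
            acc -= threshold  # keep what was not sent
        elif acc <= -threshold:
            messages.append((i, -threshold))
            acc += threshold
        new_residual.append(acc)
    return messages, new_residual


def apply_messages(weights, messages, learning_rate):
    """Apply the received sparse messages as a plain SGD step."""
    updated = list(weights)
    for i, value in messages:
        updated[i] -= learning_rate * value
    return updated


# Two identical mini-batch gradients: nothing is sent after the first,
# and only weight 0 crosses the threshold after the second.
weights = [0.0, 0.0, 0.0]
residual = [0.0, 0.0, 0.0]
for grad in ([0.6, 0.1, -0.3], [0.6, 0.1, -0.3]):
    msgs, residual = sparsify_update(grad, residual, threshold=1.0)
    weights = apply_messages(weights, msgs, learning_rate=0.5)
    print(len(msgs), "message(s) sent")
```

In this toy run only one of six per-weight gradient values is ever communicated; the unsent remainder stays in the residual and is not discarded, which is what would let each weight update at its own rate under this assumption.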

Citation impact

Total citations: 547

FWCI: 28.50
Percentile: 100%
References: 23

Authors

Nikko Ström (corresponding author)

Topics & keywords

Keywords

Computer science
Cloud computing
Scalability
Commodity
Training (meteorology)
Distributed computing
Parallel computing
Operating system

UN Sustainable Development Goals

Industry, innovation and infrastructure
