Large Scale Distributed Deep Networks

Dean, Jay B.; Corrado, Greg S.; Monga, Rajat; Chen, Kai; Devin, Matthieu; Mao, M.; Ranzato, Marc’Aurelio; Senior, Andrew; Tucker, Paul A.; Yang, Ke; Le, Quoc V.; Ng, Andrew Y.

Article · Dec 3, 2012 · Closed access

Abstract

Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch…
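
As a rough illustration of the asynchronous, multi-replica idea the abstract attributes to Downpour SGD, here is a minimal sketch. It is not DistBelief's API or the authors' implementation: the names `ParameterServer`, `train`, and `fetch_every` are hypothetical, the model is a toy two-parameter linear regression, and real concurrency is replaced by a deterministic round-robin interleaving so the result is reproducible. The point it tries to show is that each replica computes gradients against a possibly stale copy of the parameters and pushes them to a shared store without waiting for the other replicas.

```python
import random


class ParameterServer:
    """Hypothetical shared parameter store; applies gradients as they arrive."""

    def __init__(self, dim, learning_rate):
        self.params = [0.0] * dim
        self.learning_rate = learning_rate

    def fetch(self):
        # Return a copy so a replica's local view can go stale.
        return list(self.params)

    def push(self, grad):
        # Apply the gradient immediately, with no synchronization barrier.
        for i, g in enumerate(grad):
            self.params[i] -= self.learning_rate * g


def squared_error_gradient(params, x, y):
    """Gradient of 0.5 * (params . x - y)^2 for a single example."""
    error = sum(p * xi for p, xi in zip(params, x)) - y
    return [error * xi for xi in x]


def train(num_replicas=4, steps=3000, fetch_every=2, seed=0):
    """Simulate several replicas training against one parameter server.

    Assumption: replicas take turns in round-robin order, which stands in
    for true asynchronous execution on separate machines.
    """
    rng = random.Random(seed)
    true_params = [2.0, -3.0]  # target of the toy regression problem
    server = ParameterServer(dim=2, learning_rate=0.1)
    local_copies = [server.fetch() for _ in range(num_replicas)]
    local_steps = [0] * num_replicas

    for step in range(steps):
        r = step % num_replicas
        # A replica refreshes its copy only every `fetch_every` local steps,
        # so in between it works with parameters other replicas have changed.
        if local_steps[r] % fetch_every == 0:
            local_copies[r] = server.fetch()
        local_steps[r] += 1

        x = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        y = sum(t * xi for t, xi in zip(true_params, x))
        server.push(squared_error_gradient(local_copies[r], x, y))

    return server.params
```

Calling `train()` returns the server's final parameters, which should approach the toy target `[2.0, -3.0]` despite the stale local copies. Raising `fetch_every` or `num_replicas` increases staleness and is a simple way to explore how much delay this toy setup tolerates.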

Citation impact

Total citations: 2,914

FWCI: 112.63
Percentile: 100%
References: 33

Authors: 12

Topics & keywords

Keywords

Computer science
Deep learning
Artificial intelligence
Asynchronous communication
Stochastic gradient descent
Deep neural networks
Artificial neural network
Machine learning
