Large-scale matrix factorization with distributed stochastic gradient descent

Gemulla, Rainer; Nijkamp, Erik; Haas, Peter J.; Sismanis, Yannis

doi:10.1145/2020408.2020426

Article · Aug 21, 2011 · Closed access

Rainer Gemulla · Erik Nijkamp · Peter J. Haas · Yannis Sismanis

Max Planck Institute for Informatics · IBM Research - Almaden

Indexed in Crossref

Abstract

We provide a novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of nonzero elements. Our approach rests on stochastic gradient descent (SGD), an iterative stochastic optimization algorithm. We first develop a novel "stratified" SGD variant (SSGD) that applies to general loss-minimization problems in which the loss function can be expressed as a weighted sum of "stratum losses." We establish sufficient conditions for convergence of SSGD using results from stochastic approximation theory and regenerative process theory. We then specialize SSGD to obtain a new matrix-factorization algorithm, called DSGD, that can be fully distributed and run on…
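The stratum idea in the abstract can be illustrated with a small sequential sketch. It is not the authors' implementation: the d × d blocking of the matrix, the grouping of blocks into strata along shifted diagonals, the squared loss, and every name and parameter below (`sgd_step`, `stratified_sgd`, `rank`, `d`, `lr`, `reg`) are assumptions chosen for illustration. Blocks within one stratum share no rows or columns, so in a distributed setting they could be updated independently; here they are simply processed one after another.

```python
import random


def sgd_step(W, H, i, j, v, lr, reg):
    """One SGD update for a single observed entry v at (i, j), squared loss."""
    wi, hj = W[i], H[j]
    err = v - sum(a * b for a, b in zip(wi, hj))
    for k in range(len(wi)):
        wk, hk = wi[k], hj[k]
        wi[k] += lr * (err * hk - reg * wk)
        hj[k] += lr * (err * wk - reg * hk)


def rmse(entries, W, H):
    """Root-mean-square error of W H^T on the observed entries."""
    total = 0.0
    for i, j, v in entries:
        total += (v - sum(a * b for a, b in zip(W[i], H[j]))) ** 2
    return (total / len(entries)) ** 0.5


def stratified_sgd(entries, n_rows, n_cols, rank=2, d=2,
                   epochs=200, lr=0.05, reg=0.0, seed=0):
    """Sequential sketch of stratified SGD for matrix factorization.

    Assumption: the matrix is cut into a d x d grid of blocks, and stratum s
    consists of the blocks (b, (b + s) % d) for b = 0..d-1.
    """
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_cols)]

    # Assign every observed entry to its (row_block, col_block).
    blocks = {}
    for i, j, v in entries:
        key = (i * d // n_rows, j * d // n_cols)
        blocks.setdefault(key, []).append((i, j, v))

    for _ in range(epochs):
        # Visit the d strata in random order.
        for s in rng.sample(range(d), d):
            # Blocks of one stratum touch disjoint rows and columns.
            for b in range(d):
                block = blocks.get((b, (b + s) % d), [])
                for i, j, v in rng.sample(block, len(block)):
                    sgd_step(W, H, i, j, v, lr, reg)
    return W, H


# Toy data: a fully observed 4 x 4 rank-1 matrix u v^T.
u = [0.5, 1.0, 1.5, 2.0]
v = [1.0, 0.5, 2.0, 1.5]
entries = [(i, j, u[i] * v[j]) for i in range(4) for j in range(4)]

W0, H0 = stratified_sgd(entries, 4, 4, epochs=0)   # initial factors only
W1, H1 = stratified_sgd(entries, 4, 4, epochs=200)
before = rmse(entries, W0, H0)
after = rmse(entries, W1, H1)
print(f"RMSE before: {before:.4f}, after: {after:.4f}")
```

The constant step size and the absence of regularization are simplifications that suit the noise-free toy data; they are not taken from the paper.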

Citation impact

Total citations: 618

FWCI: 51.44
Percentile: 100%
References: 46

Authors: 4

Topics & keywords

Keywords

Stochastic gradient descent
Computer science
Scalability
Convergence
Mathematical optimization
Matrix decomposition
Matrix
Algorithm
