The Loss Surfaces of Multilayer Networks

Choromanska, Anna; Henaff, Mikael; Mathieu, Michaël; Arous, Gérard Ben; LeCun, Yann

doi:10.48550/arxiv.1412.0233

Article · arXiv (Cornell University) · Nov 30, 2014 · Green OA

Wroclaw Medical University · New York University

Indexed in: arXiv, DataCite

Abstract

We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band…

Citation impact

Total citations: 718

FWCI: —
Percentile: —
References: 19

Authors: 5

Topics & keywords

Keywords

Maxima and minima
Mathematics
Random graph
Artificial neural network
Overfitting
Computer science
Mathematical optimization
Applied mathematics

UN Sustainable Development Goals

No poverty
