The Loss Surfaces of Multilayer Networks
Wroclaw Medical University · New York University
Abstract
We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band…
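As a rough illustration of the object the abstract refers to, the sketch below evaluates a spherical spin-glass Hamiltonian on a small random instance. It assumes the standard spherical p-spin convention, H(σ) = N^(-(p-1)/2) · Σ J[i1..ip] · σ_i1 ⋯ σ_ip with i.i.d. standard Gaussian couplings and σ constrained to the sphere of radius √N. The abstract itself does not state this normalization, and the names `sample_couplings`, `project_to_sphere`, `hamiltonian`, `N`, and `P` are hypothetical choices for this example, not the authors' code.

```python
import itertools
import math
import random


def sample_couplings(n, p, rng):
    """Draw i.i.d. standard Gaussian couplings J[i1, ..., ip] for every index tuple."""
    return {idx: rng.gauss(0.0, 1.0) for idx in itertools.product(range(n), repeat=p)}


def project_to_sphere(x):
    """Rescale x so that sum(x_i^2) == len(x), i.e. the sphere of radius sqrt(N)."""
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x))
    scale = math.sqrt(n) / norm
    return [v * scale for v in x]


def hamiltonian(couplings, sigma, p):
    """Evaluate N^{-(p-1)/2} * sum over index tuples of J[idx] * prod(sigma[i] for i in idx)."""
    n = len(sigma)
    total = 0.0
    for idx, j in couplings.items():
        term = j
        for i in idx:
            term *= sigma[i]
        total += term
    return total / n ** ((p - 1) / 2)


# Small illustrative instance (assumed sizes; real analyses take N large).
rng = random.Random(0)
N, P = 6, 3
J = sample_couplings(N, P, rng)
sigma = project_to_sphere([rng.gauss(0.0, 1.0) for _ in range(N)])
energy = hamiltonian(J, sigma, P)
```

The dictionary of couplings grows as N^P, so this form is only practical for toy sizes; it is meant to make the definition concrete rather than to reproduce any experiment.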
Citation impact
- References: 19
Authors: 5
Topics & keywords
- Maxima and minima
- Mathematics
- Random graph
- Artificial neural network
- Overfitting
- Computer science
- Mathematical optimization
- Applied mathematics
- No poverty