Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units
Toyota Technological Institute at Chicago
Abstract
We propose the Gaussian Error Linear Unit (GELU), a high-performing neural network activation function. The GELU nonlinearity is the expected transformation of a stochastic regularizer which randomly applies the identity or zero map, combining the intuitions of dropout and zoneout while respecting neuron values. This connection suggests a new probabilistic understanding of nonlinearities. We perform an empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations and find performance improvements across all tasks.
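The abstract describes the GELU only in words. As a minimal sketch, assuming the commonly cited closed form GELU(x) = x Φ(x), where Φ is the standard normal CDF, the snippet below computes that value and compares it with a Monte Carlo average of a gate that keeps x with probability Φ(x) and zeroes it otherwise. The function names, sample count, and seed are illustrative choices, not taken from the paper.

```python
import math
import random


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float) -> float:
    """Assumed closed form: the input scaled by Phi(x)."""
    return x * std_normal_cdf(x)


def stochastic_gate_mean(x: float, n_samples: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo mean of a gate that applies the identity map with
    probability Phi(x) and the zero map otherwise (hypothetical helper)."""
    rng = random.Random(seed)
    p_keep = std_normal_cdf(x)
    kept = sum(1 for _ in range(n_samples) if rng.random() < p_keep)
    return x * kept / n_samples


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(value, gelu(value), stochastic_gate_mean(value))
```

Under this assumption the sampled mean should approach the deterministic value as the sample count grows, which is the sense in which the nonlinearity is the expectation of the stochastic regularizer.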
Citation impact
- Total citations: 755
- References: 24
Authors (2)
Topics & keywords
Keywords
- Dropout (neural networks)
- Gaussian
- Probabilistic logic
- Bridging (networking)
- Nonlinear system
- Computer science
- Transformation (genetics)
- Artificial neural network