Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations
Journal Article - Open Access
Institute of High Performance Computing, Agency for Science, Technology and Research, Connexis North, Singapore
We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, in which these algorithms are approximated by a class of stochastic differential equations with small noise parameters. We prove that this approximation can be understood mathematically as a weak approximation, which leads to a number of precise and useful results on the approximations of stochastic gradient descent (SGD), momentum SGD and the stochastic Nesterov's accelerated gradient method in the general setting of stochastic objectives. We also demonstrate through explicit calculations that this continuous-time approach can uncover important analytical insights into the stochastic gradient algorithms under consideration that may not be easy to obtain in a purely discrete-time setting.
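As a rough illustration of what a weak approximation means here, the sketch below compares the second moment of SGD iterates with that of a continuous-time model on a toy problem. Everything in it is an assumption made for illustration and is not taken from the abstract: the objective f(x) = x^2 with samples f_gamma(x) = (x - gamma)^2 - 1, gamma = +1 or -1 with equal probability; the first-order SME form dX = -f'(X) dt + sqrt(eta * Sigma) dW, which is the form commonly used in the SME literature; and the helper names `sgd_second_moment`, `sme_second_moment` and `weak_error`. For this toy problem both moments are available in closed form, so no random sampling is needed.

```python
import math


def sgd_second_moment(x0, eta, n_steps):
    """Exact E[x_k^2] for SGD on the assumed toy objective.

    SGD step: x_{k+1} = x_k - eta * 2 * (x_k - gamma_k)
                      = (1 - 2*eta) * x_k + 2*eta*gamma_k,
    where gamma_k = +/-1 is independent of x_k, so E[gamma] = 0, E[gamma^2] = 1.
    """
    second = x0 * x0
    for _ in range(n_steps):
        # The cross term vanishes because gamma_k has mean zero.
        second = (1.0 - 2.0 * eta) ** 2 * second + 4.0 * eta ** 2
    return second


def sme_second_moment(x0, eta, t):
    """E[X_t^2] for the assumed SME dX = -2X dt + 2*sqrt(eta) dW.

    This is an Ornstein-Uhlenbeck process: the gradient noise variance is
    Sigma = Var(2*(x - gamma)) = 4, so sqrt(eta * Sigma) = 2*sqrt(eta).
    """
    decay = math.exp(-4.0 * t)
    return x0 * x0 * decay + eta * (1.0 - decay)


def weak_error(eta, x0=1.0, horizon=1.0):
    """Gap between the two second moments at time `horizon` = n_steps * eta."""
    n_steps = round(horizon / eta)
    return abs(sgd_second_moment(x0, eta, n_steps)
               - sme_second_moment(x0, eta, horizon))


if __name__ == "__main__":
    for eta in (0.02, 0.01, 0.005):
        print(f"eta={eta:<6} |E x_k^2 - E X_T^2| = {weak_error(eta):.6f}")
```

In this sketch the gap shrinks roughly in proportion to the learning rate eta, which is the behavior one would expect from a first-order weak approximation. The comparison is between expectations of a test function (here x^2), not between individual sample paths, which is the sense in which the approximation is weak.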
- Statistics and Probability
- Theoretical Mathematics
- Numerical Mathematics