Internet publication
Small steps and giant leaps: minimal Newton solvers for Deep Learning
- Abstract:
- We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it requires only two additional forward-mode automatic differentiation operations per iteration, at a computational cost comparable to two standard forward passes, and is easy to implement. Our method addresses long-standing issues with current second-order solvers, which invert an approximate Hessian matrix at every iteration, either exactly or by conjugate-gradient methods, a procedure that is both costly and sensitive to noise. Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and to update it once per iteration. This estimate has the same size as, and is similar to, the momentum variable commonly used in SGD. No estimate of the Hessian is maintained. We first validate our method, called CurveBall, on small problems with known closed-form solutions (a noisy Rosenbrock function and degenerate 2-layer linear networks), where current deep learning solvers seem to struggle. We then train several large models on CIFAR and ImageNet, including ResNet and VGG-f networks, where we demonstrate faster convergence with no hyperparameter tuning. Code is available.
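The update described in the abstract can be sketched in a few lines. The JAX snippet below is an illustrative interpretation, not the authors' released code: it keeps one vector z of the same size as the parameters, refreshes it with a single Hessian-vector product obtained by forward-mode differentiation of the gradient, and never forms or inverts the Hessian. The plain Hessian (rather than the Gauss-Newton approximation), the fixed hyperparameters beta and rho (which the paper instead derives in closed form each iteration), and the toy quadratic objective are assumptions made here for brevity.

```python
# A minimal sketch of a CurveBall-style update (illustrative, not the authors' code).
import jax
import jax.numpy as jnp


def curveball_style_step(loss_fn, w, z, beta=0.1, rho=0.9):
    """Update the parameters w and the single stored estimate z (momentum-sized)."""
    grad = jax.grad(loss_fn)(w)
    # Hessian-vector product H @ z via forward-mode differentiation of the
    # gradient function; the Hessian itself is never formed or inverted.
    _, hvp = jax.jvp(jax.grad(loss_fn), (w,), (z,))
    z_new = rho * z - beta * (hvp + grad)  # refresh the projected-gradient estimate
    w_new = w + z_new                      # take the step
    return w_new, z_new


# Illustrative usage on a toy quadratic 0.5 * w^T A w - b^T w.
A = jnp.array([[3.0, 0.5], [0.5, 1.0]])
b = jnp.array([1.0, -2.0])
loss = lambda w: 0.5 * w @ A @ w - b @ w

w = jnp.zeros(2)
z = jnp.zeros_like(w)
for _ in range(200):
    w, z = curveball_style_step(loss, w, z)
print(w, jnp.linalg.solve(A, b))  # the iterate approaches the exact minimiser
```

Each step touches the objective only through one gradient and one Hessian-vector product, which is consistent with the abstract's claim of a per-iteration cost comparable to roughly two extra forward passes.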
- Publication status:
- Published
- Peer review status:
- Not peer reviewed
- Files:
- Version of record (PDF, 1.9 MB)
- Publisher copy:
- 10.48550/arxiv.1805.08095
Authors
- João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
- Host title:
- arXiv
- Publication date:
- 2018-05-21
- DOI:
- 10.48550/arXiv.1805.08095
- Language:
- English
- Keywords:
- Pubs id:
- 1212147
- Local pid:
- pubs:1212147
- Deposit date:
- 2024-11-25
Terms of use
- Copyright holder:
- Henriques et al
- Copyright date:
- 2018
- Rights statement:
- ©2018 The Authors
- Licence:
- Other