
Journal article

Is SGD a Bayesian sampler? Well, almost

Abstract:

Deep neural networks (DNNs) generalise remarkably well in the overparameterised regime, suggesting a strong inductive bias towards functions with low generalisation error. We empirically investigate this bias by calculating, for a range of architectures and datasets, the probability P_SGD(f|S) that an overparameterised DNN, trained with stochastic gradient descent (SGD) or one of its variants, converges on a function f consistent with a training set S. We also use Gaussian processes to estimate…
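A rough illustration of the estimation procedure the abstract describes: repeatedly train a small network with SGD from independent random initialisations on a fixed training set S, identify the learned function f by its labels on a set of probe inputs, and tally how often each f is reached among runs consistent with S. This is a minimal sketch under hypothetical assumptions (toy Boolean data, a one-hidden-layer tanh network), not the paper's actual experimental code.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Toy data: a Boolean target on 7-bit inputs (hypothetical stand-in for
# the kind of Boolean-function experiments the abstract alludes to).
X = rng.integers(0, 2, size=(60, 7)).astype(float)
y = (X.sum(axis=1) > 3.5).astype(float)      # target concept
X_train, y_train = X[:40], y[:40]            # training set S
X_test = X[40:]                              # probe inputs that identify f

def train_sgd(X, y, hidden=32, lr=0.1, epochs=500):
    """One SGD run on a one-hidden-layer tanh network; returns a predictor."""
    W1 = rng.normal(0, 1 / np.sqrt(X.shape[1]), (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 1 / np.sqrt(hidden), hidden)
    b2 = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):    # one example per step (plain SGD)
            h = np.tanh(X[i] @ W1 + b1)
            p = 1 / (1 + np.exp(-(h @ W2 + b2)))
            g = p - y[i]                     # dLoss/dlogit for cross-entropy
            W2 -= lr * g * h
            b2 -= lr * g
            gh = g * W2 * (1 - h ** 2)       # backprop through tanh
            W1 -= lr * np.outer(X[i], gh)
            b1 -= lr * gh
    def predict(Xq):
        logits = np.tanh(Xq @ W1 + b1) @ W2 + b2
        return (logits > 0).astype(int)
    return predict

# Estimate P_SGD(f|S): the frequency with which independent SGD runs that
# fit S converge on each function f, identified by its test-set labelling.
counts = Counter()
for _ in range(200):
    predict = train_sgd(X_train, y_train)
    if np.all(predict(X_train) == y_train):  # keep only runs consistent with S
        counts[tuple(predict(X_test))] += 1

total = sum(counts.values())
for f, n in counts.most_common(5):
    print(f"P_SGD(f|S) ~ {n / total:.3f} for f = {f}")
```

The empirical frequencies printed at the end are the quantity the paper compares against a Gaussian-process estimate of the Bayesian posterior over functions; the comparison itself is beyond this sketch.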

Publication status:
Published
Peer review status:
Peer reviewed

Publication website:
https://jmlr.org/papers/v22/20-676.html

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Physics
Sub department:
Theoretical Physics
Oxford college:
Worcester College
Role:
Author
ORCID:
0000-0002-8438-910X
Publisher:
Journal of Machine Learning Research
Journal:
Journal of Machine Learning Research
Volume:
22
Article number:
79
Pages:
1-64
Publication date:
2021-02-15
Acceptance date:
2020-10-15
EISSN:
1533-7928
ISSN:
1532-4435
Language:
English
Pubs id:
1179937
Local pid:
pubs:1179937
Deposit date:
2021-11-24
