Thesis
Spectral machine learning
- Abstract:
- In this thesis we develop a spectral approach to large kernel matrices, graphs and the Hessians of neural networks. This approach paves the way for efficient and improved inference schemes, novel algorithms and theoretical results. The first contribution of the thesis is the development of a novel Maximum Entropy algorithm applied to the problems of log determinant estimation (necessitated by Bayesian model selection), cluster counting and graph similarity. The method produces strong empirical results compared to existing established methods, such as the Lanczos algorithm. The second contribution is the development of a fielddependent, spiked random matrix theory for characterising the sub-sampling effects, due to mini-batching, on the loss surface of deep neural networks. The theory we develop predicts that, up to a threshold, non-adaptive-gradient methods should scale linearly with batch size and adaptive methods should scale with the squareroot of the batch size. We verify these results empirically. As final contributions we develop an open source software package for neural network curvature computations, show that spectral sharpness is not relevant for solution generalisation and discuss extensions to previous theoretical work. Based upon these theoretical contributions, we develop an adaptive algorithm (Gadam) which posts strong test set performance across a variety of image and text classification problems, regularly beating finely tuned non-adaptive methods.
Actions
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- 
                    English
- Keywords:
- Subjects:
- Deposit date:
- 
                    2021-07-07
Terms of use
- Copyright holder:
- Granziol, D
- Copyright date:
- 2020
If you are the owner of this record, you can report an update to it here: Report update to this record