
Thesis

Structure and uncertainty in deep learning

Abstract:

Designing uncertainty-aware deep learning models that provide reasonable uncertainty estimates alongside their predictions has long been a goal for parts of the machine learning community, and such models are frequently requested by practitioners. The most widespread and obvious approach is to take existing deep architectures and apply existing Bayesian techniques to them, for instance by treating the weights of the neural network as random variables in a Bayesian framework. This thesis attempts to answer the question: are existing neural network architectures the best way to obtain reasonable uncertainty?

In the first part of this thesis, we present research on the uncertainty behaviour of Bayesian neural networks in an adversarial setting. It demonstrates that, while a Bayesian approach improves significantly on deterministic networks near the data distribution, the extrapolation behaviour is undesirable: standard neural network architectures have a structural bias towards confident extrapolation. Motivated by this, we then explore two alternatives to standard deep learning architectures that attempt to address this issue. First, we describe a novel generative formulation of capsule networks, which impose structure on a learning task by making strong assumptions about the structure of scenes. We then use this generative model to examine whether those underlying assumptions are useful, arguing that they in fact have significant flaws. Second, we explore bilipschitz models, a family of architectures that address the more limited goal of ensuring prior reversion in deep neural networks. These are based on deep kernel learning, controlling the behaviour of neural networks out of distribution by using final classification layers that revert to a prior as the distance to a set of support vectors increases.
To maintain this property while using a neural feature extractor, we describe a novel 'bilipschitz' regularisation scheme for these models, which prevents feature collapse by imposing a constraint motivated by work on invertible networks. We describe several useful applications of these models, and analyse why this regularisation scheme remains effective even when its original motivation no longer holds, in particular when the feature dimensionality is lower than that of the input.
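The prior-reversion mechanism described above can be illustrated with a minimal sketch. This is not the thesis's implementation: `rbf_logits`, the centroids, and the length scale are hypothetical stand-ins showing how an RBF-kernel output layer's class scores decay towards zero (and hence the prediction towards the prior) as the distance from the support points grows.

```python
import numpy as np

def rbf_logits(features, centroids, length_scale=1.0):
    """Distance-aware output layer (hypothetical helper): the score for
    each class decays with squared distance from that class's centroid."""
    # Pairwise squared Euclidean distances, shape (n_points, n_classes).
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

# Two class centroids in a toy 2-D feature space.
centroids = np.array([[0.0, 0.0], [4.0, 0.0]])

near = rbf_logits(np.array([[0.1, 0.0]]), centroids)      # close to class 0
far = rbf_logits(np.array([[100.0, 100.0]]), centroids)   # far from all data

# Near the data, one class dominates; far away, every kernel value
# vanishes, so the model has no evidence and reverts to its prior.
assert near[0, 0] > 0.9 and near[0, 1] < 0.1
assert np.allclose(far, 0.0, atol=1e-6)
```

The bilipschitz constraint on the feature extractor matters because this behaviour only helps if distances in feature space track distances in input space: without a lower Lipschitz bound, the network could collapse distinct out-of-distribution inputs onto the support points, defeating the reversion.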

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Research group:
OATML
Oxford college:
Kellogg College
Role:
Author
ORCID:
0000-0001-6632-8162

Contributors

Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Research group:
OATML Group
Role:
Supervisor
ORCID:
0000-0002-2733-2078
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Examiner
Institution:
University of Toronto
Role:
Examiner


Funding
Funder identifier:
http://dx.doi.org/10.13039/501100000266
Funding agency for:
Smith, L
Grant:
EP/L015897/1


Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford
