Thesis
Uncertainty Estimation: single forward pass methods and applications in Active Learning
- Abstract:
- Machine Learning (ML) models are now powerful enough to be used in complex automated decision-making settings such as autonomous driving and medical diagnosis. Although these models are very accurate in general, they still make mistakes. To depend on such models, it is critical that they can quantify the uncertainty of their predictions, and paramount that users of the model take this uncertainty into account. Unfortunately, deep learning models cannot readily express their uncertainty, rendering them unsafe for many real-world applications. Bayesian modelling provides a mathematical framework for learning models that can express their uncertainty. However, exact Bayesian methods are computationally expensive to learn and evaluate, and approximate methods often reduce accuracy or are still prohibitively expensive. Meanwhile, ML models continue to grow in number of parameters, so one has to choose between being (more) Bayesian and using a larger model. So far, this choice has always fallen in favour of larger models. Instead of building on Bayesian methods, we deconstruct uncertainty estimation and formulate desiderata that we base our work on throughout the thesis (Chapter 1). In Chapter 3, we introduce a new model (DUQ) that is able to estimate uncertainty in a single forward pass by carefully constructing the model’s parameter and output space based on the desiderata. We then extend this model in Chapter 4 (DUE) by placing it in the framework provided by Deep Kernel Learning. This enables the model to work well for both classification and regression tasks (as opposed to just classification), and to estimate uncertainty over a batch of inputs jointly. Both models are competitive with standard softmax models in terms of accuracy and speed, while having significantly improved uncertainty estimation.
We additionally consider the problem of Active Learning (AL), where the goal is to maximise label efficiency by selecting only the most informative data points to be labelled. In Section 4.5, we evaluate the DUE model in AL for personalised healthcare. Here, the labelled dataset needs to adhere to specific assumptions made in causal inference, which makes this a challenging problem. In Chapter 5, we look at AL in the batch setting. We show that current methods do not select diverse batches of data, and we introduce a principled method to overcome this issue. Building upon deep kernel learning, this thesis provides a compelling foundation for single forward pass uncertainty and advances the state of the art in active learning. In the conclusions (Section 6, and at the end of each chapter), we discuss how users of ML models could make use of these tools for making sound and confident decisions.
Authors
Contributors
+ Gal, Y
- Role:
- Supervisor
+ Teh, YW
- Role:
- Supervisor
+ Baydin, AG
- Role:
- Examiner
+ Wilson, AG
- Role:
- Examiner
+ Engineering and Physical Sciences Research Council
- Funder identifier:
- http://dx.doi.org/10.13039/501100000266
- Grant:
- EP/N509711/1
- Programme:
- EPSRC Doctoral Training Partnership
+ DeepMind
- Funder identifier:
- http://dx.doi.org/10.13039/100017149
- Programme:
- Oxford-DeepMind Graduate Scholarship
- DOI:
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2023-08-13
Terms of use
- Copyright holder:
- Joost René van Amersfoort
- Copyright date:
- 2022