Thesis
Practical Bayesian optimisation for hyperparameter tuning
- Abstract:
Advances in machine learning have had, and continue to have, a profound effect on scientific research and industrial activities. We are able to uncover insights contained within large troves of data and develop models to solve problems that seemed infeasible until recently. Before we can train a model, we have to define high-level properties, also known as the model’s hyperparameters. Examples include the architecture of a neural network, the class and parameters of a stochastic optimisation algorithm, the number of trees in a random forest or the kernel in a Gaussian process. One of the key challenges in developing good machine learning models is choosing good values for these hyperparameters.
Evaluating the quality of a set of hyperparameters is usually computationally costly, as it requires us to first train a model specified by the hyperparameters before we can evaluate a score. The computational cost of scoring a configuration means that we want to perform this hyperparameter search with as few trials as possible. Bayesian optimisation (BO) is a general-purpose, efficient global optimisation technique that has proven to be remarkably effective for many expensive optimisation tasks, including hyperparameter tuning. This thesis brings together multiple contributions aimed at making BO more practically applicable, focussing on uncertain-input measurements as well as model hyperparameter tuning.
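The BO loop described above can be sketched in a few lines: a Gaussian-process surrogate models the scores observed so far, and an expected-improvement acquisition picks the next configuration to evaluate. This is a minimal illustrative sketch, not the thesis's implementation; the kernel settings, the evaluation budget, and the cheap quadratic stand-in for an expensive "train, then score" objective are all assumptions.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Standard zero-mean GP regression equations.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    mu = Ks.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def norm_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def expected_improvement(mu, sigma, best):
    # EI for minimisation: expected amount by which a candidate beats `best`.
    ei = np.empty_like(mu)
    for i in range(len(mu)):
        z = (best - mu[i]) / sigma[i]
        ei[i] = (best - mu[i]) * norm_cdf(z) + sigma[i] * norm_pdf(z)
    return ei

def objective(x):
    # Cheap stand-in for an expensive "train a model, then score it" step.
    return (x - 0.65) ** 2

grid = np.linspace(0.0, 1.0, 101)   # candidate hyperparameter values
X = np.array([0.0, 0.5, 1.0])       # initial design
y = objective(X)
for _ in range(10):                 # small evaluation budget
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X, y = np.append(X, x_next), np.append(y, objective(x_next))

best_x = X[np.argmin(y)]            # incumbent after the budget is spent
```

Each iteration spends one (expensive) evaluation where the surrogate predicts the largest expected improvement, which is what lets BO find good configurations in far fewer trials than grid or random search.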
The case of data with input noise is relevant when we are modelling (or finding extremal values of) a quantity defined on a spatial grid. We undertake a comparative study evaluating existing uncertain-input Gaussian process (GP) methods, and develop a novel approach based on the unscented transform that outperforms them.
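The unscented transform at the heart of this approach can be illustrated in one dimension: instead of linearising a nonlinear function around the input mean, it propagates a small set of deterministically chosen "sigma points" through the function and moment-matches the output. The function below is a hedged sketch under assumed settings (the `kappa` parameter and names are illustrative, not the thesis's formulation).

```python
import numpy as np

def unscented_transform(mean, var, f, kappa=2.0):
    # 1-D unscented transform: propagate N(mean, var) through a
    # nonlinear f via three weighted sigma points.
    n = 1                                     # input dimension
    spread = np.sqrt((n + kappa) * var)
    points = np.array([mean, mean + spread, mean - spread])
    weights = np.array([kappa, 0.5, 0.5]) / (n + kappa)
    fx = f(points)
    out_mean = weights @ fx                   # moment-matched output mean
    out_var = weights @ (fx - out_mean) ** 2  # moment-matched output variance
    return out_mean, out_var

# For f(x) = x^2 with x ~ N(0, 1), the true output mean and variance
# are 1 and 2, and the transform recovers both in this case.
m, v = unscented_transform(0.0, 1.0, lambda x: x ** 2)
```

Because only a handful of function evaluations are needed, this gives a cheap way to account for input uncertainty inside a GP model.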
We then address the question of parallelising BO to make optimal use of available computing resources. We develop a novel class of methods for asynchronous parallel BO, and demonstrate empirically the benefits of our approach, as well as more generally showing the advantages of asynchronous over synchronous BO.
Finally, we propose a new framework for dealing with a mixture of categorical and continuous hyperparameters. With few exceptions, model hyperparameters comprise both categorical and continuous variables, yet BO approaches have mainly focussed on continuous search spaces. Our framework combines multi-armed bandits with continuous BO methods, leveraging the strengths of each to achieve state-of-the-art performance on real-world optimisation tasks.
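The bandit-plus-BO idea can be sketched as an outer UCB1 bandit that treats each categorical choice as an arm, with an inner search tuning the continuous hyperparameter for the chosen arm. In this hedged sketch, plain random search stands in for the inner continuous BO, and the arm names and toy objective are illustrative assumptions, not the thesis's benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(category, x):
    # Toy objective: each categorical choice has its own continuous optimum.
    optima = {"rbf": 0.3, "matern": 0.7}
    return 1.0 - (x - optima[category]) ** 2   # higher is better, in [0, 1]

arms = ["rbf", "matern"]                       # categorical choices
counts = {a: 0 for a in arms}                  # pulls per arm
means = {a: 0.0 for a in arms}                 # running mean reward per arm

for t in range(1, 31):
    def ucb(a):
        # UCB1: play each arm once, then trade off mean reward vs uncertainty.
        if counts[a] == 0:
            return float("inf")
        return means[a] + np.sqrt(2.0 * np.log(t) / counts[a])

    arm = max(arms, key=ucb)
    x = rng.uniform(0.0, 1.0)                  # inner continuous search step
    r = score(arm, x)                          # (stand-in for continuous BO)
    counts[arm] += 1
    means[arm] += (r - means[arm]) / counts[arm]
```

The outer bandit handles the discrete structure (where GP kernels are awkward), while the inner continuous optimiser exploits smoothness within each category; that division of labour is what the framework leverages.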
- Grant:
- 1136085
- Programme:
- The possible uses of computers to analyse data of complex systems
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2020-08-21
Terms of use
- Copyright holder:
- Alvi, A
- Copyright date:
- 2019