Thesis

Automated data acquisition via Bayesian experimental design

Abstract:

Acquiring high-quality data is a major challenge in science and engineering. Data collection, whether through large-scale online surveys or carefully conducted laboratory experiments, is often costly, time-consuming, and constrained by limited resources. Consequently, designing optimal experiments to gather informative and valuable data is crucial for the efficient allocation of resources and, ultimately, for making better decisions.

Bayesian experimental design (BED) provides a principled mathematical framework for designing experiments that learn efficiently about a phenomenon of interest. This thesis focuses on information-theoretic BED, which uses the expected information gain (EIG) as the design criterion. Maximising the EIG selects the design whose data are expected to be most informative about the underlying scientific question or hypothesis. This approach has the potential to significantly enhance the quality and efficiency of data acquisition, addressing the critical challenges of high costs, time constraints, and limited resources.
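
For reference, the EIG is commonly written as the expected reduction in uncertainty about the unknowns once the outcome is observed. The abstract does not fix any notation, so the symbols here are assumptions made for illustration: \theta for the quantities of interest, \xi for the design, and y for the experimental outcome.

```latex
\mathrm{EIG}(\xi)
  = \mathbb{E}_{p(\theta)\, p(y \mid \theta, \xi)}
    \left[ \log p(y \mid \theta, \xi) - \log p(y \mid \xi) \right],
\qquad
p(y \mid \xi) = \mathbb{E}_{p(\theta)} \left[ p(y \mid \theta, \xi) \right].
```

Equivalently, this is the mutual information between \theta and y under design \xi. The marginal likelihood p(y \mid \xi) inside the logarithm is the term that usually has no closed form.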

Despite its potential, the practical application of BED has historically been limited by significant computational challenges. The primary challenge lies in calculating the EIG, which is intractable for most realistic problems. Furthermore, adaptive experimental design, which uses the information gained from previous experiments to guide the design of subsequent ones, adds a further layer of computational complexity.
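
To make the cost of estimating the EIG concrete, the following is a minimal sketch of a nested Monte Carlo estimator on a toy linear-Gaussian model. It is not a method from the thesis. The model (y = design * theta + noise, with a standard normal prior on theta), the function names, and the sample sizes are all assumptions chosen for illustration. This toy model has a closed-form EIG, which most realistic models do not, so the estimate can be checked. For brevity, the inner samples are shared across outer samples rather than redrawn for each one.

```python
import numpy as np


def gaussian_logpdf(y, mean, sigma):
    """Log density of N(mean, sigma^2) evaluated at y."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mean) / sigma) ** 2


def nmc_eig(design, n_outer=2000, n_inner=2000, sigma=1.0, seed=0):
    """Nested Monte Carlo EIG estimate for the toy model y = design * theta + noise.

    Assumed model (illustrative only): theta ~ N(0, 1), noise ~ N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)

    # Outer samples: draw (theta, y) jointly from prior and likelihood.
    theta = rng.standard_normal(n_outer)
    y = design * theta + sigma * rng.standard_normal(n_outer)
    log_lik = gaussian_logpdf(y, design * theta, sigma)

    # Inner samples: fresh prior draws used to estimate the marginal p(y | design).
    theta_inner = rng.standard_normal(n_inner)
    inner = gaussian_logpdf(y[:, None], design * theta_inner[None, :], sigma)

    # Numerically stable log-mean-exp over the inner dimension.
    row_max = inner.max(axis=1, keepdims=True)
    log_marginal = row_max[:, 0] + np.log(np.mean(np.exp(inner - row_max), axis=1))

    return float(np.mean(log_lik - log_marginal))


def exact_eig(design, sigma=1.0):
    """Closed-form EIG for the toy model: 0.5 * log(1 + design^2 / sigma^2)."""
    return float(0.5 * np.log1p(design**2 / sigma**2))


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0):
        print(d, nmc_eig(d), exact_eig(d))
```

Each call evaluates the likelihood n_outer * n_inner times for a single candidate design. An adaptive experiment would need to repeat this kind of estimate, together with a posterior update, at every step, which is the added cost the paragraph above refers to.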

This thesis develops scalable and computationally efficient BED approaches to overcome these challenges. We introduce a novel class of methods called policy-based BED (PB-BED), which uses deep learning to fully amortise the cost of adaptive experimental design, enabling real-time design decisions. Additionally, we introduce a semi-amortised approach that allows for periodic refinement of the design policy during the experiment itself, based on the actual data collected. This approach enhances the adaptability and robustness of the PB-BED framework, helping the design strategy remain effective as new information is gathered. Finally, we introduce a unified model-agnostic framework for designing large-scale contextual experiments using information-theoretic principles.

The contributions made in this thesis represent a step towards automated and reliable experimental design strategies that have the potential to accelerate scientific discovery and improve data-driven decision making across various domains.

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Sub department:
Statistics
Research group:
OxCSML
Oxford college:
St Peter's College
Role:
Author
ORCID:
0000-0002-0128-3773

Contributors

Institution:
University of Edinburgh
Role:
Contributor
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Sub department:
Statistics
Oxford college:
University College
Role:
Contributor
Institution:
Microsoft
Role:
Contributor
Institution:
Microsoft
Role:
Contributor
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Sub department:
Statistics
Role:
Contributor
ORCID:
0000-0002-2916-6878


Funder identifier:
https://ror.org/0439y7842
Funding agency for:
Ivanova, DR
Grant:
EP/S023151/1
Programme:
Modern Statistics and Statistical Machine Learning (StatML) CDT


DOI:
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford

