Thesis
Automated data acquisition via Bayesian experimental design
- Abstract:
Acquiring high-quality data is a major challenge in science and engineering. Data collection, whether through large-scale online surveys or carefully conducted laboratory experiments, is often costly, time-consuming, and constrained by limited resources. Consequently, designing optimal experiments to gather informative and valuable data is crucial for the efficient allocation of resources and, ultimately, for making better decisions.
Bayesian experimental design (BED) provides a principled mathematical framework for designing experiments to efficiently learn about a phenomenon of interest. This thesis focuses on information-theoretic BED, employing expected information gain (EIG) as the design criterion. Maximising the EIG selects the design whose data is expected to be most informative about the underlying scientific question or hypothesis. This approach has the potential to significantly enhance the quality and efficiency of data acquisition, addressing the critical challenges of high costs, time constraints, and limited resources.
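For reference, the EIG criterion can be written in its standard form as below. The symbols are assumptions introduced here for illustration and do not appear in the abstract: \theta denotes the unknown quantity of interest, \xi the design, and y the experimental outcome.

```latex
\mathrm{EIG}(\xi)
  = \mathbb{E}_{p(\theta)\, p(y \mid \theta, \xi)}
    \left[ \log p(y \mid \theta, \xi) - \log p(y \mid \xi) \right],
\qquad
p(y \mid \xi) = \mathbb{E}_{p(\theta)}\left[ p(y \mid \theta, \xi) \right].
```

The marginal likelihood p(y \mid \xi) inside the logarithm is itself an expectation, which is why the EIG generally has no closed form.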
Despite its potential, the practical application of BED has historically been limited by significant computational challenges. The primary challenge lies in calculating the EIG, which is intractable for most realistic problems. Furthermore, adaptive experimental design, which leverages the information gained from previous experiments to guide the design of subsequent ones, adds a further layer of computational complexity.
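To give a sense of the computational cost involved, the sketch below estimates the EIG with a generic nested Monte Carlo estimator on a toy linear-Gaussian model, where a closed-form answer exists for comparison. This is an illustration under stated assumptions, not a method from the thesis: the model (a standard normal prior on theta, with outcome y = design * theta + Gaussian noise), the function names, and the sample sizes are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_log_lik(y, theta, design, noise_sd):
    """Log density of y under the assumed toy model y ~ N(design * theta, noise_sd^2)."""
    mean = design * theta
    return -0.5 * np.log(2 * np.pi * noise_sd**2) - 0.5 * ((y - mean) / noise_sd) ** 2


def nmc_eig(design, n_outer=2000, n_inner=1000, noise_sd=1.0, seed=0):
    """Nested Monte Carlo estimate of the EIG for a scalar design (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    # Outer samples: draw theta from the N(0, 1) prior, then simulate an outcome y.
    theta_outer = rng.standard_normal(n_outer)
    y = design * theta_outer + noise_sd * rng.standard_normal(n_outer)
    log_lik = gaussian_log_lik(y, theta_outer, design, noise_sd)

    # Inner samples: fresh prior draws used to approximate the marginal p(y | design).
    theta_inner = rng.standard_normal(n_inner)
    inner = gaussian_log_lik(y[:, None], theta_inner[None, :], design, noise_sd)

    # Numerically stable log-mean-exp over the inner samples, one value per outer sample.
    row_max = inner.max(axis=1, keepdims=True)
    log_marginal = row_max[:, 0] + np.log(np.mean(np.exp(inner - row_max), axis=1))

    return float(np.mean(log_lik - log_marginal))


def exact_eig(design, noise_sd=1.0):
    """Closed-form EIG for the toy model: 0.5 * log(1 + design^2 / noise_sd^2)."""
    return 0.5 * np.log1p((design / noise_sd) ** 2)


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0):
        print(f"design={d}: NMC={nmc_eig(d):.3f}, exact={exact_eig(d):.3f}")
```

Even in this one-dimensional example, each estimate evaluates the likelihood n_outer * n_inner times for a single candidate design. Searching over many designs, and repeating that search after every new observation in an adaptive experiment, multiplies the cost accordingly.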
This thesis focuses on developing scalable and computationally efficient BED approaches to overcome these challenges. We introduce a novel class of methods called policy-based BED (PB-BED), which leverages deep learning to fully amortise the cost of adaptive experimental design, enabling real-time design decisions. Additionally, we introduce a semi-amortised approach that allows for periodic refinement of the design policy during the experiment itself, based on the actual data collected. This approach enhances the adaptability and robustness of the PB-BED framework, ensuring that the design strategy remains optimal as new information is gathered. Finally, we introduce a unified model-agnostic framework for designing large-scale contextual experiments using information-theoretic principles.
The contributions made in this thesis represent a step towards automated and reliable experimental design strategies that have the potential to accelerate scientific discovery and improve data-driven decision making across various domains.
Authors
Contributors
- Institution:
- University of Edinburgh
- Role:
- Contributor
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Statistics
- Sub department:
- Statistics
- Oxford college:
- University College
- Role:
- Contributor
- Institution:
- Microsoft
- Role:
- Contributor
- Institution:
- Microsoft
- Role:
- Contributor
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Statistics
- Sub department:
- Statistics
- Role:
- Contributor
- ORCID:
- 0000-0002-2916-6878
- Funder identifier:
- https://ror.org/0439y7842
- Funding agency for:
- Ivanova, DR
- Grant:
- EP/S023151/1
- Programme:
- Modern Statistics and Statistical Machine Learning (StatML) CDT
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Pubs id:
- 2051460
- Local pid:
- pubs:2051460
- Deposit date:
- 2024-10-27
Terms of use
- Copyright holder:
- Ivanova, DR
- Copyright date:
- 2024