Thesis
Generative models: theory and applications
- Abstract:
Given some samples from a data distribution, the central aim of generative modeling is to generate more samples from approximately the same distribution. This framework has recently seen an explosion in popularity, with impressive applications in image generation, language modeling and protein synthesis. The striking success of these methods motivates two key questions: under what conditions do generative models provide accurate approximations to the underlying data distribution, and can we extend the range of scenarios in which they may be applied? This thesis considers these questions in the context of two classes of generative models: diffusion models and importance weighted autoencoders.
Diffusion models work by iteratively applying noise to the data distribution and then learning to remove this noise. They were originally introduced for real-valued data. However, for many potential applications our data is most naturally defined on another state space - perhaps a manifold, or a discrete space. We describe extensions of diffusion models to arbitrary state spaces, using generic Markov processes for noising, and show how such models can be effectively learned. We also provide a detailed study of a specific extension to discrete state spaces. Next, we investigate the approximation accuracy of diffusion models. We derive error bounds for flow matching - a generalization of diffusion models - and improve upon state-of-the-art bounds for diffusion models, using techniques inspired by stochastic localization.
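As a rough illustration of the noising step described above, the following sketch repeatedly applies Gaussian noise to real-valued toy data until it is close to standard normal. It is not taken from the thesis: the variance-preserving update, the linear schedule `betas`, the bimodal toy data and the function name `forward_noise` are all assumptions chosen for illustration, and the learned denoising step is omitted.

```python
import numpy as np

def forward_noise(x0, betas, rng):
    """Apply x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps_t for each beta_t."""
    x = np.asarray(x0, dtype=float).copy()
    for beta in betas:
        eps = rng.standard_normal(x.shape)  # fresh Gaussian noise at every step
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * eps
    return x

rng = np.random.default_rng(0)
x0 = rng.choice([-2.0, 2.0], size=20000)  # toy bimodal "data" (assumption)
betas = np.linspace(1e-4, 0.05, 400)      # hypothetical noise schedule
xT = forward_noise(x0, betas, rng)

# After enough steps the samples should be close to N(0, 1).
print(xT.mean(), xT.var())
```

A generative model of this kind would then be trained to reverse these steps, starting from Gaussian noise and ending at samples resembling `x0`.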
Importance weighted autoencoders (IWAEs) work by learning a latent variable representation of the data, using importance sampling in the evidence lower bound to get a tighter variational objective. IWAEs suffer from several limitations, including posterior variance underestimation, poor signal-to-noise ratio during training, and weight collapse in the importance sampling ratios. We propose an extension of the IWAE - the VR-IWAE - that addresses the first two of these three issues. We then provide a detailed theoretical study of the third, showing that it persists even for the VR-IWAE. We provide empirical demonstrations of these phenomena on a range of simulated and real-world data.
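To make the importance weighted bound concrete, here is a minimal sketch on a toy Gaussian model where the log evidence is known in closed form. The model (prior N(0, 1), likelihood N(z, 1)), the Gaussian proposals and the function names are assumptions for illustration only, not the setup used in the thesis. The bound estimated is the expected log of the average of K importance weights p(x, z) / q(z | x).

```python
import numpy as np

def log_normal(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def iwae_bound(x, q_mean, q_var, num_particles, num_outer, rng):
    """Monte Carlo estimate of the importance weighted bound with K = num_particles."""
    z = q_mean + np.sqrt(q_var) * rng.standard_normal((num_outer, num_particles))
    # log w = log p(z) + log p(x | z) - log q(z | x) for the toy model
    log_w = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, q_mean, q_var)
    # numerically stable log of the mean weight across the K particles
    m = log_w.max(axis=1, keepdims=True)
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    return log_mean_w.mean()

rng = np.random.default_rng(0)
x = 1.0
log_px = log_normal(x, 0.0, 2.0)  # exact log evidence: x ~ N(0, 2)

exact_q = iwae_bound(x, 0.5 * x, 0.5, 8, 100, rng)  # proposal equals the true posterior
loose_k1 = iwae_bound(x, 0.0, 1.0, 1, 2000, rng)    # prior as proposal, K = 1 (standard ELBO)
loose_k64 = iwae_bound(x, 0.0, 1.0, 64, 2000, rng)  # same proposal, K = 64
print(log_px, exact_q, loose_k1, loose_k64)
```

When the proposal equals the true posterior, every weight equals p(x) and the bound is exact for any K. With a mismatched proposal the bound sits below the log evidence and tightens as K grows.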
Contributors
- Role:
- Supervisor
- ORCID:
- 0000-0002-7662-419X
- Role:
- Supervisor
- ORCID:
- 0000-0002-0821-4607
- Grant:
- EP/S023151/1
- Programme:
- CDT in Modern Statistics and Statistical Machine Learning
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Deposit date:
- 2024-03-29