Thesis
Towards data-efficient deep learning with meta-learning and symmetries
- Abstract:
Recent advances in deep learning have been significantly propelled by the increasing availability of data and computational resources. While abundant data enables models to perform well in certain domains, there are real-world applications, such as in the medical field, where data is scarce or difficult to collect. Moreover, a seemingly large dataset is often better viewed as a collection of many related small datasets, in which case the data available for any single task is insufficient. It is also noteworthy that human intelligence often needs only a handful of examples to perform well on a new task, underscoring the importance of designing data-efficient AI systems. This thesis explores two strategies for addressing this challenge: meta-learning and symmetries. Meta-learning treats a data-rich environment as a collection of many small, individual datasets. Each of these small datasets represents a distinct task, yet the tasks share underlying knowledge. Harnessing this shared knowledge allows us to design learning algorithms that efficiently solve new tasks from similar domains. Symmetry, by contrast, is a form of direct prior knowledge: by guaranteeing that a model's predictions transform consistently under a specified group of transformations of its inputs, such models enjoy better sample efficiency and generalization.
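To state this symmetry prior precisely (the notation here is ours, not part of the record): for a group G acting on a model's inputs and, where applicable, its outputs, the two standard properties are

    f(g \cdot x) = f(x)              (invariance)
    f(g \cdot x) = g \cdot f(x)      (equivariance)

Invariance matches the "consistent predictions" description above; the subsampling and upsampling layers mentioned in the next paragraph target equivariance.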
In the subsequent chapters, we present novel techniques and models that all aim to improve the data efficiency of deep learning systems. First, we demonstrate the success of encoder-decoder style meta-learning methods based on Conditional Neural Processes (CNPs). Second, we introduce a new class of expressive meta-learned stochastic process models, constructed by stacking sequences of neurally parameterised Markov transition operators in function space. Finally, we propose group equivariant subsampling and upsampling layers that address the loss of equivariance incurred by conventional subsampling and upsampling layers; these layers can be used to construct end-to-end equivariant models with improved data efficiency.
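As a rough illustration of the encoder-decoder construction behind CNPs, the sketch below is our own minimal PyTorch rendering, not the thesis's implementation; the layer sizes, mean-pooling aggregation, and Gaussian predictive head are illustrative assumptions.

    import torch
    import torch.nn as nn

    class ConditionalNeuralProcess(nn.Module):
        """Minimal CNP sketch: encode (x, y) context pairs, pool them into a
        permutation-invariant task representation, and decode target inputs
        into a predictive Gaussian."""

        def __init__(self, x_dim=1, y_dim=1, r_dim=128):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(x_dim + y_dim, r_dim), nn.ReLU(),
                nn.Linear(r_dim, r_dim),
            )
            self.decoder = nn.Sequential(
                nn.Linear(x_dim + r_dim, r_dim), nn.ReLU(),
                nn.Linear(r_dim, 2 * y_dim),  # predictive mean and raw scale
            )

        def forward(self, x_ctx, y_ctx, x_tgt):
            # Encode each context pair, then mean-pool across the context set.
            r = self.encoder(torch.cat([x_ctx, y_ctx], dim=-1)).mean(1, keepdim=True)
            r = r.expand(-1, x_tgt.shape[1], -1)
            out = self.decoder(torch.cat([x_tgt, r], dim=-1))
            mean, raw_scale = out.chunk(2, dim=-1)
            # Softplus keeps the predictive standard deviation positive.
            return mean, nn.functional.softplus(raw_scale) + 1e-4

    # Hypothetical usage: a batch of 16 tasks, 10 context and 20 target points each.
    model = ConditionalNeuralProcess()
    x_c, y_c = torch.randn(16, 10, 1), torch.randn(16, 10, 1)
    x_t, y_t = torch.randn(16, 20, 1), torch.randn(16, 20, 1)
    mean, scale = model(x_c, y_c, x_t)
    loss = -torch.distributions.Normal(mean, scale).log_prob(y_t).mean()

Mean-pooling over the context set gives the defining CNP property that predictions do not depend on the order of the context points.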
Authors
- Xu, J
Contributors
- Supervisor: University of Oxford, MPLS, Statistics
- Supervisor: University of Oxford, MPLS, Statistics
- Examiner
- Examiner: University of Oxford, MPLS, Engineering Science
- Funding agency for:
- Xu, J
- Programme:
- Oxford-Tencent Collaboration on Large Scale Machine Learning
- DOI:
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2024-10-08
Terms of use
- Copyright holder:
- Xu, J
- Copyright date:
- 2023