Thesis
Advances in meta-reinforcement learning and imitation learning
- Abstract:
Recent advances in machine learning (ML) have led to remarkable progress in artificial intelligence (AI), enabling computers to solve previously intractable problems. While breakthroughs in generative AI, such as large language models and diffusion-based image generators, have captured public attention, the development of interactive AI agents remains an emerging frontier. Unlike generative models, interactive agents are designed to learn from their interactions with dynamic environments, enabling applications such as robotics, autonomous driving, and adaptive control. This thesis explores two critical aspects of learning for interactive AI: meta-reinforcement learning (meta-RL) and imitation learning (IL).
In the first part, we address the problem of improving the sample efficiency of reinforcement learning (RL) algorithms through meta-learning. By developing novel meta-RL methods, we enable agents to adaptively learn how to learn, enhancing their ability to assign credit and optimize behavior efficiently. We introduce a meta-gradient-based approach to adaptively assign credit over time and present theoretical and empirical insights into the challenges of meta-gradient estimation. The second part of the thesis focuses on IL in settings where the expert and imitator have different observations of the environment, leading to what is known as the “imitation gap”. We propose algorithms to tackle this gap, modeling the missing information as confounding latent variables or Bayesian priors, and show that these methods enable effective imitation in complex scenarios.
Overall, this thesis contributes to the development of interactive learning agents through advances in meta-RL and IL, providing algorithms, empirical analysis, and theoretical insights. Our work not only advances the efficiency and adaptability of learning agents, but also lays the groundwork for future research to build more robust, safe, and generalizable AI systems capable of interacting effectively with complex environments.
Authors
Contributors
+ Beck, J
- Role:
- Contributor
+ Liu, E
- Role:
- Contributor
+ Xiong, Z
- Role:
- Contributor
+ Zintgraf, L
- Role:
- Contributor
+ Finn, C
- Role:
- Contributor
+ Engineering and Physical Sciences Research Council
- Funder identifier:
- https://ror.org/0439y7842
- Grant:
- CS2020 _ EPSRC/CS_ 1189713
- Programme:
- Doctoral Training Partnership Scholarship/Department of Computer Science Scholarship
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Deposit date:
- 2025-12-21
Terms of use
- Copyright holder:
- Risto Vuorio
- Copyright date:
- 2024