Thesis

Advances in meta-reinforcement learning and imitation learning

Abstract:
Recent advances in machine learning (ML) have led to remarkable progress in artificial intelligence (AI), enabling computers to solve previously intractable problems. While breakthroughs in generative AI, such as large language models and diffusion-based image generators, have captured public attention, the development of interactive AI agents remains an emerging frontier. Unlike generative models, interactive agents are designed to learn from their interactions with dynamic environments, enabling applications such as robotics, autonomous driving, and adaptive control. This thesis explores two critical aspects of learning for interactive AI: meta-reinforcement learning (meta-RL) and imitation learning (IL).

In the first part, we address the problem of improving the sample efficiency of reinforcement learning (RL) algorithms through meta-learning. By developing novel meta-RL methods, we enable agents to adaptively learn how to learn, enhancing their ability to assign credit and optimize behavior efficiently. We introduce a meta-gradient-based approach to adaptively assign credit over time and present theoretical and empirical insights into the challenges of meta-gradient estimation. The second part of the thesis focuses on IL in settings where the expert and imitator have different observations of the environment, leading to what is known as the “imitation gap”. We propose algorithms to tackle this gap, modeling the missing information as confounding latent variables or Bayesian priors, and show that these methods enable effective imitation in complex scenarios.

Overall, this thesis contributes to the development of interactive learning agents through advances in meta-RL and IL, providing algorithms, empirical analysis, and theoretical insights. Our work not only advances the efficiency and adaptability of learning agents, but also lays the groundwork for future research to build more robust, safe, and generalizable AI systems capable of interacting effectively with complex environments.

Authors

Institution: University of Oxford
Division: MPLS
Department: Computer Science
Role: Author

Funder identifier: https://ror.org/0439y7842
Grant: CS2020 _ EPSRC/CS_ 1189713
Programme: Doctoral Training Partnership Scholarship/Department of Computer Science Scholarship


Type of award: DPhil
Level of award: Doctoral
Awarding institution: University of Oxford

