Thesis icon

Thesis

Reinforcement learning in large state action spaces

Abstract:

Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real world deployment of RL systems. However, several challenges limit the applicability of RL to large scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization and lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.

This thesis is motivated towards bridging the aforementioned gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we propose the first results on several different problems: e.g. tensorization of the Bellman equation which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS(Chapter 3), combinatorial generalization results in cooperative MAS(Chapter 5), generalization results on observation shifts(Chapter 7), learning deterministic policies in a probabilistic RL framework(Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we also shed light on generalization aspects of the agents under different frameworks. These properties have been been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).

In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large scale, real world applications.

Actions


Access Document


Files:

Authors


More by this author
Division:
MPLS
Department:
Computer Science
Role:
Author

Contributors

Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Supervisor
Role:
Examiner
Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Examiner


More from this funder
Funder identifier:
http://dx.doi.org/10.13039/100017149
Funding agency for:
Mahajan, A
Programme:
Google-Deepmind Graduate Scholarship
More from this funder
Funding agency for:
Mahajan, A
Programme:
J.P. Morgan AI Fellowship
More from this funder
Funding agency for:
Mahajan, A
Programme:
Department of Computer Science Graduate Scholarship


Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford

Terms of use



Views and Downloads






If you are the owner of this record, you can report an update to it here: Report update to this record

TO TOP