Thesis
Reinforcement learning in large state action spaces
- Abstract:
 - 
		
			
Reinforcement learning (RL) is a promising framework for training intelligent agents that learn to optimize long-term utility by interacting directly with the environment. Creating RL methods that scale to large state-action spaces is a critical step towards real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints such as decentralization, and a lack of guarantees about important properties such as performance, generalization, and robustness in potentially unseen scenarios.
This thesis aims to address these challenges. We propose several principled algorithms and frameworks for studying and addressing them in RL. The proposed methods cover a wide range of RL settings: single-agent and multi-agent systems (MAS), including all variations of the latter; prediction and control; model-based and model-free methods; and value-based and policy-based methods. In this work we present the first results on several different problems, e.g. tensorization of the Bellman equation, which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. We also shed light on the generalization behaviour of agents under different frameworks. These properties are driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.
 
Authors
Contributors
- Institution:
 - University of Oxford
 - Division:
 - MPLS
 - Department:
 - Computer Science
 - Role:
 - Supervisor
 
- Role:
 - Examiner
 
- Institution:
 - University of Oxford
 - Division:
 - MPLS
 - Department:
 - Computer Science
 - Role:
 - Examiner
 
- Funder identifier:
 - http://dx.doi.org/10.13039/100017149
 - Funding agency for:
 - Mahajan, A
 - Programme:
 - Google-Deepmind Graduate Scholarship
 
- Funding agency for:
 - Mahajan, A
 - Programme:
 - J.P. Morgan AI Fellowship
 
- Funding agency for:
 - Mahajan, A
 - Programme:
 - Department of Computer Science Graduate Scholarship
 
- Type of award:
 - DPhil
 - Level of award:
 - Doctoral
 - Awarding institution:
 - University of Oxford
 
- Language:
 - English
 - Keywords:
 - Subjects:
 - Deposit date:
 - 2023-06-06
 
Terms of use
- Copyright holder:
 - Mahajan, A
 - Copyright date:
 - 2023
 