
Thesis

Breaking the deadly triad in reinforcement learning

Abstract:

Reinforcement Learning (RL) is a promising framework for solving, via trial and error, sequential decision-making problems that emerge from agent-environment interactions. Off-policy learning, one of the most important techniques in RL, enables an RL agent to learn from agent-environment interactions generated by a policy (i.e., a decision-making rule that an agent relies on to interact with the environment) that is different from the policy of interest. Arguably, this flexibility is ke...


Authors


Division:
MPLS
Department:
Computer Science
Role:
Author

Contributors

Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Role:
Supervisor
Institution:
University of Oxford
Role:
Examiner
Institution:
Stanford University
Role:
Examiner
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford
Language:
English
Deposit date:
2022-07-18
