Towards data-efficient deployment of reinforcement learning systems

Schulze, S

Thesis

Abstract: A fundamental concern in the deployment of artificial agents in real life is their capacity to adapt quickly to their surroundings. Traditional reinforcement learning (RL) struggles with this requirement in two ways. Firstly, iterative exploration of unconstrained environment dynamics yields numerous uninformative updates and consequently slow adaptation. Secondly, final policies have no capacity to adapt to future observations and must either keep learning slowly and indefinitely or be retrained entirely as new observations occur.

This thesis explores two formulations aimed at addressing these issues. Considering entire task distributions, as in meta-RL, yields policies that adapt quickly to specific task instances on their own. By forcing agents to request feedback explicitly, Active RL enforces selective observations and updates. Both of these formulations reduce to a Bayes-Adaptive setting in which a probabilistic belief over possible environments is maintained. Many existing solutions only provide asymptotic guarantees that are of limited use in practical contexts. We develop a variational approach to approximate belief management and support its validity empirically through a broad range of ablations. We then consider recently successful planning approaches but uncover and discuss obstacles to their application in these settings.

An important factor influencing the data requirements and stability of RL systems is the choice of appropriate hyperparameters. We develop a Bayesian optimisation approach that exploits the iterative structure of training processes and whose empirical performance exceeds that of existing baselines.

A final contribution of this thesis concerns increasing the scalability and expressiveness of Gaussian Processes (GPs). While we make no direct use of the presented framework, GPs have been used to model probabilistic beliefs in closely related settings.

Cite this record

APA Style

Schulze, S. (2021). Towards data-efficient deployment of reinforcement learning systems [PhD thesis]. University of Oxford.

MLA Style

Schulze, S. Towards Data-Efficient Deployment of Reinforcement Learning Systems. University of Oxford, 2021.

Chicago Style

Schulze, S. 2021. “Towards Data-Efficient Deployment of Reinforcement Learning Systems.” PhD thesis, University of Oxford.