Thesis

Safe model based policy search

Abstract:: This dissertation focuses on safe model-based policy search, a subfield of reinforcement learning with two main objectives: data efficiency and safety. To achieve data-efficient learning, we use Gaussian process (GP) regression to model the dynamics of unknown non-linear systems. The flexibility and probabilistic nature of GPs, together with mathematical properties that often allow closed-form calculations, facilitate building accurate models efficiently and using these models to optimise control policies for the underlying systems. Our safety objective, also probabilistic in nature, is formalised as predefined state-space constraints. The model's predictions are used to certify the safety of a candidate policy before it is deployed on the system, which allows us to avoid constraint violations during training. We present an open-source software tool implementing our proposed algorithm for safe and data-efficient policy search. In addition, we propose a novel method for planning over multiple time steps with Gaussian processes, and provide formal guarantees bounding the predictive uncertainty. We consider safety and data efficiency to be critical challenges for the wider adoption of reinforcement learning algorithms, and we hope that our contributions will be useful in this effort.

Cite this record

APA Style

Polymenakos, K. (2020). Safe model based policy search [PhD thesis]. University of Oxford.

MLA Style

Polymenakos, K. Safe Model Based Policy Search. University of Oxford, 2020.

Chicago Style

Polymenakos, K. 2020. “Safe Model Based Policy Search.” PhD thesis, University of Oxford.

Access Document

Files:: PhD_thesis_KP_changes_upload.pdf

(Dissemination version, pdf, 2.3MB)

Authors

Polymenakos, K

Role:: Author

Contributors

Role:: Supervisor

Role:: Supervisor

Role:: Examiner

Rammoorthy, S

Role:: Examiner

Engineering and Physical Sciences Research Council

Funder identifier:: http://dx.doi.org/10.13039/501100000266

DOI:: 10.5287/ora-dpbgjyv66
Type of award:: DPhil
Level of award:: Doctoral
Awarding institution:: University of Oxford

Language:: English
Keywords:: Policy Search, Safety, Model based, Gaussian process, Reinforcement Learning
Subjects:: Machine learning, Reinforcement learning
Deposit date:: 2021-06-27

Terms of use

Copyright holder:: Polymenakos, K
Copyright date:: 2020

Licence:: Terms and Conditions of Use for Oxford University Research Archive
