Alternating optimisation and quadrature for robust control

Paul, S; Chatzilygeroudis, K; Ciosek, K; Mouret, J; Osborne, M; Whiteson, S

AI Collection

Conference item

Abstract:: Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables: state features that are unobservable and randomly determined by the environment in a physical setting but are controllable in a simulator. This paper considers the problem of finding a robust policy while taking into account the impact of environment variables. We present Alternating Optimisation and Quadrature (ALOQ), which uses Bayesian optimisation and Bayesian quadrature to address such settings. ALOQ is robust to the presence of significant rare events, which may not be observable under random sampling, but play a substantial role in determining the optimal policy. Experimental results across different domains show that ALOQ can learn more efficiently and robustly than existing methods.
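
The abstract's point about rare events can be illustrated with a toy calculation. The sketch below is not ALOQ and uses neither Bayesian optimisation nor Bayesian quadrature; it is a plain probability-weighted sum over a two-valued environment variable, with every name and number (`THETAS`, `PROBS`, `simulated_return`, the two policies) invented for illustration. It only shows why setting the environment variable explicitly in a simulator can change which policy looks best, compared with letting it be drawn at random.

```python
import random

# Hypothetical toy problem for illustration only; none of these names or
# numbers come from the paper.
THETAS = ["normal", "rare_hazard"]  # values of the environment variable
PROBS = [0.98, 0.02]                # their assumed probabilities


def simulated_return(policy, theta):
    """Return of one simulated episode for a policy and an environment variable."""
    if policy == "aggressive":
        return 10.0 if theta == "normal" else -1000.0
    return 8.0  # the "cautious" policy is unaffected by the hazard


def sampled_estimate(policy, n_episodes, rng):
    """Estimate expected return with theta drawn at random, as in a physical setting."""
    draws = rng.choices(THETAS, weights=PROBS, k=n_episodes)
    return sum(simulated_return(policy, t) for t in draws) / n_episodes


def weighted_estimate(policy):
    """Estimate expected return by setting theta explicitly and weighting by its probability."""
    return sum(p * simulated_return(policy, t) for t, p in zip(THETAS, PROBS))


if __name__ == "__main__":
    rng = random.Random(0)
    for policy in ("cautious", "aggressive"):
        print(policy, sampled_estimate(policy, 10, rng), weighted_estimate(policy))
```

Under these assumed numbers the weighted estimate is 8.0 for the cautious policy and about -10.2 for the aggressive one, so the cautious policy is the robust choice. With only 10 randomly sampled episodes, the hazard is never observed with probability 0.98^10 (roughly 0.82), in which case the aggressive policy appears to return 10.0 and looks better.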

Publication status:: Published

Peer review status:: Peer reviewed

Cite this record

APA Style

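Paul, S., Chatzilygeroudis, K., Ciosek, K., Mouret, J., Osborne, M., & Whiteson, S. (2018). Alternating optimisation and quadrature for robust control. In 32nd AAAI Conference on Artificial Intelligence (AAAI'18) (pp. 3925–3933). AAAI Press.

MLA Style

Paul, S., et al. “Alternating Optimisation and Quadrature for Robust Control.” 32nd AAAI Conference on Artificial Intelligence (AAAI'18), AAAI Press, 2018, pp. 3925–33.

Chicago Style

Paul, S., K. Chatzilygeroudis, K. Ciosek, J. Mouret, M. Osborne, and S. Whiteson. 2018. “Alternating Optimisation and Quadrature for Robust Control.” In 32nd AAAI Conference on Artificial Intelligence (AAAI'18), 3925–33. AAAI Press.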

Access Document

Files:: paulaaai18 (1).pdf

(Accepted manuscript, pdf, 581.2KB)

Authors

Paul, S

Role:: Author

Chatzilygeroudis, K

Role:: Author

Ciosek, K

Institution:: University of Oxford
Division:: MPLS Division
Department:: Computer Science
Role:: Author

Mouret, J

Role:: Author

Osborne, M

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Role:: Author

Whiteson, S

Role:: Author

Funding

European Research Council

Grant:: 637713; 637972

Publisher:: AAAI Press
Host title:: 32nd AAAI Conference on Artificial Intelligence (AAAI'18)
Journal:: 32nd AAAI Conference on Artificial Intelligence (AAAI'18)
Pages:: 3925-3933
Publication date:: 2018-04-29
Acceptance date:: 2017-11-09
ISSN:: 2159-5399

Keywords:: Bayesian optimisation; Bayesian quadrature; reinforcement learning
Pubs id:: pubs:745008
UUID:: uuid:abd7c997-b0fb-4e66-b601-f82184500cbf
Local pid:: pubs:745008
Source identifiers:: 745008
Deposit date:: 2017-11-11
ARK identifier:: ark:/29072/ora_abd7c997b0fb4e66b601f82184500cbf

Terms of use

Copyright holder:: Association for the Advancement of Artificial Intelligence
Notes:: Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). This is the accepted manuscript version of the paper. The final version is available online from AAAI Press at: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16621

Licence:: Terms and Conditions of Use for Oxford University Research Archive
