Conference item
Alternating optimisation and quadrature for robust control
- Abstract:
- Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables: state features that are unobservable and randomly determined by the environment in a physical setting but are controllable in a simulator. This paper considers the problem of finding a robust policy while taking into account the impact of environment variables. We present Alternating Optimisation and Quadrature (ALOQ), which uses Bayesian optimisation and Bayesian quadrature to address such settings. ALOQ is robust to the presence of significant rare events, which may not be observable under random sampling, but play a substantial role in determining the optimal policy. Experimental results across different domains show that ALOQ can learn more efficiently and robustly than existing methods.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
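The rare-event point in the abstract can be illustrated with a small toy calculation. The sketch below is not ALOQ and uses no Bayesian optimisation or Bayesian quadrature. It only shows, under invented assumptions, how a rarely occurring environment setting can change which policy is best once the environment variable is averaged out, and how easily plain random sampling can miss that setting. The return function `toy_return`, the probability `P_RARE`, and the policy grid are all hypothetical and are not taken from the paper.

```python
# Toy illustration only: the return function, the rare-event probability and
# the policy grid are invented for this sketch and do not come from the paper.

P_RARE = 0.01  # assumed probability of the rare environment setting


def toy_return(policy, rare):
    """Hypothetical return: aggressive policies pay off unless the rare event occurs."""
    return -200.0 * policy if rare else policy


def expected_return(policy, p_rare=P_RARE):
    """Exact expectation of the return over the two-valued environment variable."""
    return (1.0 - p_rare) * toy_return(policy, False) + p_rare * toy_return(policy, True)


def prob_rare_never_sampled(n, p_rare=P_RARE):
    """Chance that n independent random draws of the environment contain no rare event."""
    return (1.0 - p_rare) ** n


policies = [i / 10 for i in range(11)]  # candidate policies 0.0, 0.1, ..., 1.0

# Best policy if the rare event is never observed (what naive sampling may suggest).
best_ignoring_rare = max(policies, key=lambda p: toy_return(p, False))

# Best policy once the environment variable is marginalised out.
best_robust = max(policies, key=expected_return)

print("best policy ignoring the rare event:", best_ignoring_rare)
print("best policy under the full expectation:", best_robust)
print("P(no rare event in 20 random draws):", prob_rare_never_sampled(20))
```

In this toy setting the two selections disagree, which is the situation the abstract describes: an event that random sampling will often fail to observe still determines the optimal policy.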
- Files:
- Accepted manuscript (pdf, 581.2KB)
- Publisher:
- AAAI Press
- Host title:
- 32nd AAAI Conference on Artificial Intelligence (AAAI'18)
- Journal:
- 32nd AAAI Conference on Artificial Intelligence (AAAI'18)
- Pages:
- 3925-3933
- Publication date:
- 2018-04-29
- Acceptance date:
- 2017-11-09
- ISSN:
- 2159-5399
- Pubs id:
- pubs:745008
- UUID:
- uuid:abd7c997-b0fb-4e66-b601-f82184500cbf
- Local pid:
- pubs:745008
- Source identifiers:
- 745008
- Deposit date:
- 2017-11-11
Terms of use
- Copyright holder:
- Association for the Advancement of Artificial Intelligence
- Copyright date:
- 2018
- Notes:
- Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). This is the accepted manuscript version of the paper. The final version is available online from AAAI Press at: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16621