Trading performance for stability in Markov decision processes

Brazdil, T; Chatterjee, K; Forejt, V; Kucera, A

Journal article

Abstract:: We study controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize the expected mean-payoff performance and stability (also known as variability in the literature). We argue that the basic notion of expressing the stability using the statistical variance of the mean payoff is sometimes insufficient, and propose an alternative definition.

We show that a strategy ensuring both the expected mean payoff and the variance below given bounds requires randomization and memory, under both the above definitions. We then show that the problem of finding such a strategy can be expressed as a set of constraints.
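The gap between the two views of stability can be seen in a toy calculation. The sketch below is illustrative only and is not taken from the paper: the function names, the two hypothetical reward sequences, and the "within-run" measure (average squared deviation of each reward from the run's mean payoff) are assumptions chosen to show why the variance of the mean payoff alone may say little about stability. It works on finite prefixes of runs rather than on an actual Markov decision process.

```python
def mean_payoff(rewards):
    """Average reward over a finite prefix of a run."""
    return sum(rewards) / len(rewards)


def variance_across_runs(runs):
    """Variance of the mean payoff, taken over a collection of runs."""
    payoffs = [mean_payoff(r) for r in runs]
    avg = sum(payoffs) / len(payoffs)
    return sum((p - avg) ** 2 for p in payoffs) / len(payoffs)


def variance_within_run(rewards):
    """Average squared deviation of each reward from the run's mean payoff.

    This is an assumed, illustrative stability measure, not necessarily
    the alternative definition proposed in the paper.
    """
    mp = mean_payoff(rewards)
    return sum((r - mp) ** 2 for r in rewards) / len(rewards)


# Two hypothetical systems with the same expected mean payoff of 5.
steady_runs = [[5] * 1000, [5] * 1000]                # always reward 5
alternating_runs = [[0, 10] * 500, [10, 0] * 500]     # rewards swing 0/10

for name, runs in [("steady", steady_runs), ("alternating", alternating_runs)]:
    print(
        name,
        "mean payoff:", mean_payoff(runs[0]),
        "variance across runs:", variance_across_runs(runs),
        "variance within run:", variance_within_run(runs[0]),
    )
```

Both systems have mean payoff 5 and zero variance of the mean payoff across runs, yet the alternating one swings between 0 and 10 at every step, which only the within-run measure (25 versus 0) picks up.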

Publication status:: Published

Peer review status:: Peer reviewed

Cite

APA Style

Brazdil, T., Chatterjee, K., Forejt, V., & Kucera, A. (2016). Trading performance for stability in Markov decision processes. Journal of Computer and System Sciences, 125, 70–81.

MLA Style

Brazdil, T., et al. “Trading Performance for Stability in Markov Decision Processes.” Journal of Computer and System Sciences, vol. 125, Elsevier, 2016, pp. 70–81.

Chicago Style

Brazdil, T, K Chatterjee, V Forejt, and A Kucera. 2016. “Trading Performance for Stability in Markov Decision Processes.” Journal of Computer and System Sciences 125: 70–81.

Access Document

Files:: Forejt et al, Trading performance for stability in Markov deci...

(Version of record, pdf, 692.0KB)

Publisher copy:: 10.1016/j.oceaneng.2016.08.007

Authors

Brazdil, T

Role:: Author

Chatterjee, K

Role:: Author

Forejt, V

Institution:: University of Oxford
Division:: MPLS
Department:: Computer Science
Role:: Author

Kucera, A

Role:: Author

Funding

Engineering and Physical Sciences Research Council

Funding agency for:: Forejt, V
Grant:: EP/M023656/1

Czech Science Foundation

Grant:: P202/12/P612

European Research Council

Grant:: 279307

Austrian Science Fund

Grant:: S 11407-N23

Bibliographic Details

Publisher:: Elsevier
Journal:: Journal of Computer and System Sciences
Volume:: 125
Pages:: 70–81
Publication date:: 2016-10-01
Acceptance date:: 2016-09-23
DOI:: 10.1016/j.oceaneng.2016.08.007
EISSN:: 1090-2724
ISSN:: 0022-0000

Item Description

Pubs id:: pubs:652697
UUID:: uuid:98165cbf-b07b-4de3-977c-46b3131b216b
Local pid:: pubs:652697
Source identifiers:: 652697
Deposit date:: 2016-10-17

Terms of use

Copyright holder:: Forejt et al
Notes:: © 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Licence:: CC Attribution (CC BY)
