Communicating via Markov decision processes

Sokota, S; de Witt, CS; Igl, M; Zintgraf, L; Torr, P; Strohmeier, M; Kolter, JZ; Whiteson, S; Foerster, J

AI Collection

Conference item

Abstract:: We consider the problem of communicating exogenous information by means of Markov decision process trajectories. This setting, which we call a Markov coding game (MCG), generalizes both source coding and a large class of referential games. MCGs also isolate a problem that is important in decentralized control settings in which cheap talk is not available: namely, they require balancing communication with the associated cost of communicating. We contribute a theoretically grounded approach to MCGs based on maximum entropy reinforcement learning and minimum entropy coupling that we call MEME. Due to recent breakthroughs in approximation algorithms for minimum entropy coupling, MEME is not merely a theoretical algorithm, but can be applied to practical settings. Empirically, we show both that MEME is able to outperform a strong baseline on small MCGs and that MEME is able to achieve strong performance on extremely large MCGs. To the latter point, we demonstrate that MEME is able to losslessly communicate binary images via trajectories of Cartpole and Pong, while simultaneously achieving the maximal or near maximal expected returns, and that it is even capable of performing well in the presence of actuator noise.
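
Illustrative note (not part of the deposited record): the abstract relies on minimum entropy coupling, that is, finding a joint distribution with two given marginals whose joint entropy is as small as possible. The sketch below shows one well-known greedy heuristic for that problem, which repeatedly pairs the largest remaining masses. It is a minimal sketch under stated assumptions: the function names `greedy_coupling` and `entropy_bits` are hypothetical, and nothing in this record confirms that this is the approximation algorithm the authors use in MEME.

```python
import math


def greedy_coupling(p, q, tol=1e-12):
    """Greedy heuristic for a low-entropy coupling of marginals p and q.

    Assumption: p and q are sequences of non-negative floats that each sum to 1.
    Returns a len(p) x len(q) joint table whose row sums are p and column sums are q.
    """
    p_left, q_left = list(p), list(q)
    joint = [[0.0] * len(q_left) for _ in p_left]
    while True:
        # Pick the largest remaining mass on each side.
        i = max(range(len(p_left)), key=p_left.__getitem__)
        j = max(range(len(q_left)), key=q_left.__getitem__)
        mass = min(p_left[i], q_left[j])
        if mass <= tol:
            break  # nothing (beyond rounding error) left to assign
        # Assign as much mass as both sides allow to cell (i, j).
        joint[i][j] += mass
        p_left[i] -= mass
        q_left[j] -= mass
    return joint


def entropy_bits(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(x * math.log2(x) for x in probs if x > 0)


# Usage: couple a fair coin with a three-outcome distribution.
p = [0.5, 0.5]
q = [0.5, 0.25, 0.25]
joint = greedy_coupling(p, q)
joint_entropy = entropy_bits(x for row in joint for x in row)
independent_entropy = entropy_bits(p) + entropy_bits(q)
print(joint_entropy, independent_entropy)
```

For these marginals the greedy coupling has a lower joint entropy than the independent (product) coupling, which is the property a coupling-based encoder would exploit; the greedy heuristic is not guaranteed to be optimal in general.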

Publication status:: Published

Peer review status:: Peer reviewed

Cite this record

APA Style

Sokota, S., de Witt, C. S., Igl, M., Zintgraf, L., Torr, P., Strohmeier, M., Kolter, J. Z., Whiteson, S., & Foerster, J. (2022). Communicating via Markov decision processes. 39th International Conference on Machine Learning (ICML 2022), 162, 20314–20328.

MLA Style

Sokota, S, et al. “Communicating via Markov Decision Processes.” 39th International Conference on Machine Learning (ICML 2022), Proceedings of Machine Learning Research, vol. 162, 2022, pp. 20314–28.

Chicago Style

Sokota, S, CS de Witt, M Igl, L Zintgraf, P Torr, M Strohmeier, JZ Kolter, S Whiteson, and J Foerster. 2022. “Communicating via Markov Decision Processes.” In 39th International Conference on Machine Learning (ICML 2022), 162:20314–28. Proceedings of Machine Learning Research. Journal of Machine Learning Research.

Access Document

Files:: Sokota_et_al_2023_communicating_via_markov.pdf

(Preview, Accepted manuscript, pdf, 1.6MB, Terms of use)

Authors

Sokota, S

Role:: Author

de Witt, CS

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author
ORCID:: 0000-0003-4245-1179

Igl, M

Role:: Author

Zintgraf, L

Role:: Author

Torr, P

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Publisher:: Journal of Machine Learning Research
Host title:: Proceedings of the 39th International Conference on Machine Learning (ICML 2022)
Volume:: 162
Pages:: 20314-20328
Series:: Proceedings of Machine Learning Research
Publication date:: 2022-01-01
Event title:: 39th International Conference on Machine Learning (ICML 2022)
Event location:: Baltimore, MD, USA
Event website:: https://icml.cc/Conferences/2022
Event start date:: 2022-07-17
Event end date:: 2022-07-23
EISSN:: 2640-3498
ISSN:: 2640-3498

Language:: English
Keywords:: FFR
Pubs id:: 1336644
Local pid:: pubs:1336644
Deposit date:: 2023-07-24
ARK identifier:: ark:/29072/ora_37f38590f57749819cf5f818184786b7

Terms of use

Copyright holder:: Sokota et al
Notes:: This paper was presented at the 39th International Conference on Machine Learning (ICML 2022), 17th-23rd July 2022, Baltimore, MD, USA.

Licence:: Terms and Conditions of Use for Oxford University Research Archive
