QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning

Rashid, T; Samvelyan, M; Schroeder de Witt, C; Farquhar, G; Foerster, J; Whiteson, S

AI Collection

Conference item


Abstract:: In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised learning, but the best strategy for then extracting decentralised policies is unclear. Our solution is QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations. We structurally enforce that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning, and guarantees consistency between the centralised and decentralised policies. We evaluate QMIX on a challenging set of StarCraft II micromanagement tasks, and show that QMIX significantly outperforms existing value-based multi-agent reinforcement learning methods.
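
The monotonicity idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the names `make_mixer`, `mix`, and `joint_argmax`, the layer sizes, and the random linear maps standing in for learned state-conditioned networks are all hypothetical. The sketch assumes one common way of enforcing monotonicity, namely taking the absolute value of state-dependent mixing weights and using an increasing nonlinearity (ELU). Under that assumption, the joint value never decreases when any per-agent value increases, so each agent's own greedy action also maximises the joint value.

```python
import itertools
import numpy as np


def elu(x):
    # ELU is an increasing function, so it preserves monotonicity.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)


def make_mixer(n_agents, state_dim, hidden=8, seed=0):
    """Build a toy mixer (hypothetical; random parameters instead of learned ones)."""
    rng = np.random.default_rng(seed)
    # Linear maps from the global state to the mixing parameters.
    H_w1 = rng.normal(size=(state_dim, n_agents * hidden))
    H_b1 = rng.normal(size=(state_dim, hidden))
    H_w2 = rng.normal(size=(state_dim, hidden))
    H_b2 = rng.normal(size=(state_dim,))

    def mix(agent_qs, state):
        # abs() keeps every mixing weight non-negative (assumed constraint);
        # biases are left unconstrained because they do not affect monotonicity.
        w1 = np.abs(state @ H_w1).reshape(n_agents, hidden)
        b1 = state @ H_b1
        w2 = np.abs(state @ H_w2)
        b2 = state @ H_b2
        h = elu(agent_qs @ w1 + b1)
        return float(h @ w2 + b2)

    return mix


def joint_argmax(mix, agent_q_tables, state):
    """Brute-force search over all joint actions (exponential in n_agents)."""
    n_agents, n_actions = agent_q_tables.shape
    idx = np.arange(n_agents)
    return max(
        itertools.product(range(n_actions), repeat=n_agents),
        key=lambda a: mix(agent_q_tables[idx, list(a)], state),
    )


# Toy check: 3 agents, 5 actions each, 4-dimensional global state.
mix = make_mixer(n_agents=3, state_dim=4)
rng = np.random.default_rng(1)
state = rng.normal(size=4)
q_tables = rng.normal(size=(3, 5))  # stand-in per-agent action-values

decentralised = tuple(int(a) for a in q_tables.argmax(axis=1))
centralised = joint_argmax(mix, q_tables, state)
print(decentralised, centralised)
```

The brute-force search is only there to check the claim on a tiny example. The point of the monotonic structure is that the per-agent `argmax` makes such a search unnecessary.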

Publication status:: Published

Peer review status:: Peer reviewed


Cite this record

APA Style

Rashid, T., Samvelyan, M., Schroeder de Witt, C., Farquhar, G., Foerster, J., & Whiteson, S. (2018). QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning.

MLA Style

Rashid, T, et al. “QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.” 2018.

Chicago Style

Rashid, T, M Samvelyan, C Schroeder de Witt, G Farquhar, J Foerster, and S Whiteson. 2018. “QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.”

Access Document

Files:: Whiteson et al, QMIX - Monotonic value function factorisation ...

(Preview, Version of record, pdf, 3.6MB, Terms of use)

Authors

Rashid, T

Institution:: University of Oxford
Division:: MPLS Division
Department:: Computer Science
Role:: Author

Samvelyan, M

Role:: Author

Schroeder de Witt, C

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Role:: Author

Farquhar, G

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Role:: Author

Foerster, J

Institution:: University of Oxford
Division:: MPLS Division
Department:: Engineering Science
Role:: Author

Funding

Engineering and Physical Sciences Research Council

Grant:: EP/N509711/1

Horizon 2020

Grant:: 637713

NVIDIA

Microsoft

Innovation Fund Denmark

Bibliographic Details

Publisher:: Journal of Machine Learning Research
Host title:: 35th International Conference on Machine Learning (ICML 2018)
Journal:: 35th International Conference on Machine Learning (ICML 2018)
Publication date:: 2018-07-03
Acceptance date:: 2018-06-12

Pubs id:: pubs:857023
UUID:: uuid:4e16ec00-f9e2-48ef-83fe-92e2b845fb87
Local pid:: pubs:857023
Source identifiers:: 857023
Deposit date:: 2018-06-12
ARK identifier:: ark:/29072/ora_4e16ec00f9e248ef83fe92e2b845fb87

Terms of use

Copyright holder:: Whiteson et al
Notes:: Copyright 2018 by the author(s). This is the accepted manuscript version of the article. The final version is available online from Journal of Machine Learning Research at: http://proceedings.mlr.press/v80/rashid18a.html

Licence:: Terms and Conditions of Use for Oxford University Research Archive
