Conference item

Counterfactual multi-agent policy gradients

Abstract:
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this end, we propose a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients. COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents’ policies. In addition, to address the challenges of multi-agent credit assignment, it uses a counterfactual baseline that marginalises out a single agent’s action, while keeping the other agents’ actions fixed. COMA also uses a critic representation that allows the counterfactual baseline to be computed efficiently in a single forward pass. We evaluate COMA in the testbed of StarCraft unit micromanagement, using a decentralised variant with significant partial observability. COMA significantly improves average performance over other multi-agent actor-critic methods in this setting, and the best performing agents are competitive with state-of-the-art centralised controllers that get access to the full state.
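Illustrative sketch (not part of the published record): the counterfactual baseline described in the abstract can be read as an expectation of the critic's Q-values over one agent's own actions, with the other agents' actions held fixed. The snippet below assumes that reading and assumes the marginalisation is weighted by the agent's current policy. The function name `counterfactual_advantage` and the toy numbers are hypothetical and are not taken from the paper's code.

```python
def counterfactual_advantage(q_values, policy, taken_action):
    """Advantage of one agent's chosen action against a counterfactual baseline.

    q_values[k]: critic estimate of Q if this agent had taken action k while
                 all other agents' actions stay fixed (assumed to come from a
                 single critic forward pass).
    policy[k]:   this agent's probability of choosing action k.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Marginalise out this agent's action: policy-weighted mean of Q.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[taken_action] - baseline


# Toy example with three available actions (made-up values).
q_values = [1.0, 3.0, 2.0]
policy = [0.5, 0.25, 0.25]
advantage = counterfactual_advantage(q_values, policy, taken_action=1)
```

Because only the agent's own action varies inside the baseline, the resulting advantage reflects that agent's individual contribution to the joint outcome, which is the credit-assignment idea the abstract points to.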
Publication status:
Published
Peer review status:
Peer reviewed

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Oxford college:
St Catherine's College
Role:
Author


Publisher:
AAAI Press
Host title:
32nd AAAI Conference on Artificial Intelligence (AAAI'18)
Journal:
32nd AAAI Conference on Artificial Intelligence (AAAI'18)
Pages:
2974-2982
Publication date:
2018-04-29
Acceptance date:
2017-11-09
ISSN:
2159-5399


Pubs id:
pubs:745007
UUID:
uuid:37e732fe-a876-4699-8ee3-d556bfd235b3
Local pid:
pubs:745007
Source identifiers:
745007
Deposit date:
2017-11-11
