Conference item
Conditionally optimistic exploration for cooperative deep multi-agent reinforcement learning
- Abstract:
- Efficient exploration is critical in cooperative deep Multi-Agent Reinforcement Learning (MARL). In this work, we propose an exploration method that effectively encourages cooperative exploration based on the idea of a sequential action-computation scheme. The high-level intuition is that, under optimism-based exploration, agents explore cooperative strategies if each agent's optimism estimate captures a structured dependency relationship with other agents. Assuming agents compute actions following a sequential order at each environment timestep, we provide a perspective to view MARL as tree search iterations by considering agents as nodes at different depths of the search tree. Inspired by the theoretically justified tree search algorithm UCT (Upper Confidence bounds applied to Trees), we develop a method called Conditionally Optimistic Exploration (COE). COE augments each agent's state-action value estimate with an action-conditioned optimistic bonus derived from the visitation count of the global state and joint actions of preceding agents. COE is performed during training and disabled at deployment, making it compatible with any value decomposition method for centralized training with decentralized execution. Experiments across various cooperative MARL benchmarks show that COE outperforms current state-of-the-art exploration methods on hard-exploration tasks.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
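The abstract's description of an action-conditioned bonus can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the class name `ConditionalCountBonus`, the helper `select_actions`, the constant `c`, and the exact UCT-style bonus form are all assumptions made for illustration, and the paper's deep-RL setting would need an approximate counting scheme rather than exact dictionary counts. The sketch assumes hashable states and agents that choose actions in a fixed order.

```python
import math


class ConditionalCountBonus:
    """Hypothetical tabular tracker; names and bonus form are illustrative assumptions."""

    def __init__(self, n_agents, c=1.0):
        self.n_agents = n_agents
        self.c = c  # exploration constant (assumed, not taken from the paper)
        self.counts = {}  # (state, prefix of joint action) -> visit count

    def update(self, state, joint_action):
        # Count the state together with every prefix of the joint action,
        # mirroring nodes at increasing depth of a search tree.
        for depth in range(self.n_agents + 1):
            key = (state, tuple(joint_action[:depth]))
            self.counts[key] = self.counts.get(key, 0) + 1

    def bonus(self, state, preceding_actions, action):
        # UCT-style bonus: large when this action has rarely followed the
        # preceding agents' actions in this state. The +1 terms avoid
        # division by zero and log(0).
        prefix = tuple(preceding_actions)
        parent = self.counts.get((state, prefix), 0)
        child = self.counts.get((state, prefix + (action,)), 0)
        return self.c * math.sqrt(math.log(parent + 1) / (child + 1))


def select_actions(q_values, state, tracker, explore=True):
    """Agents act in order; each maximises its value estimate plus the bonus.

    q_values[i][a] is agent i's value estimate for action a. With
    explore=False the bonus is dropped, as at deployment.
    """
    chosen = []
    for agent_q in q_values:
        scores = [
            q + (tracker.bonus(state, chosen, a) if explore else 0.0)
            for a, q in enumerate(agent_q)
        ]
        chosen.append(max(range(len(scores)), key=scores.__getitem__))
    return tuple(chosen)
```

In this sketch, setting `explore=False` removes the bonus so that each agent acts greedily on its own value estimate, which corresponds to the abstract's statement that the exploration term is disabled at deployment.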
Access Document
- Files:
- (Version of record, pdf, 473.1KB)
- Publication website:
- https://proceedings.mlr.press/v216/zhao23b.html
Authors
- Publisher:
- Journal of Machine Learning Research
- Host title:
- Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
- Volume:
- 216
- Pages:
- 2529-2540
- Article number:
- 236
- Publication date:
- 2023-01-01
- Event title:
- 39th Conference on Uncertainty in Artificial Intelligence (UAI 2023)
- Event location:
- Pittsburgh, Pennsylvania, USA
- Event website:
- https://www.auai.org/uai2023/
- Event start date:
- 2023-07-31
- Event end date:
- 2023-08-04
- EISSN:
- 2640-3498
- Language:
- English
- Pubs id:
- 1536300
- Local pid:
- pubs:1536300
- Deposit date:
- 2026-06-16
- ARK identifier:
Terms of use
- Copyright holder:
- Zhao et al.
- Copyright date:
- 2023
- Rights statement:
- © 2023 The Author(s). Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License.
- Licence:
- CC Attribution (CC BY)