Conference item
CoMAS: co-evolving multi-agent systems via interaction rewards
- Abstract:
Self-evolution is a central research topic in enabling large language model (LLM)-based agents to continually improve their capabilities after pretraining. Recent research has witnessed a transition from reinforcement learning (RL)-free to RL-based methods. Current RL-based methods either rely on dense external reward signals or extract intrinsic reward signals from LLMs themselves. However, these approaches diverge from the self-evolution mechanisms observed in human intelligence, where individuals learn and improve through mutual discussion and collaboration. In this work, we introduce Co-Evolving Multi-Agent Systems (CoMAS), a novel framework that enables agents to improve autonomously by learning from inter-agent interactions without external supervision. CoMAS generates intrinsic rewards from rich discussion dynamics, employs an LLM-as-a-judge mechanism to formulate these rewards, and optimizes each agent’s policy through RL, thereby enabling decentralized and scalable co-evolution. Experimental results demonstrate that CoMAS consistently outperforms untrained agents and achieves state-of-the-art performance across most evaluation settings. Ablation studies confirm the necessity of interaction-based reward signals and reveal promising scalability as the number and diversity of agents increase. These findings establish CoMAS as a novel and effective paradigm for self-evolution in LLM-based agents. Our code is available at: https://github.com/xxyQwQ/CoMAS.
- Publication status:
- Accepted
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 1.7MB)
- Publication website:
- https://openreview.net/forum?id=ihwAzktmWc
Authors
- Publisher:
- OpenReview
- Host title:
- Proceedings of the 14th International Conference on Learning Representations (ICLR 2026)
- Article number:
- 5758
- Acceptance date:
- 2026-01-26
- Event title:
- 14th International Conference on Learning Representations (ICLR 2026)
- Event location:
- Rio de Janeiro, Brazil
- Event website:
- https://openreview.net/pdf?id=ihwAzktmWc
- Event start date:
- 2026-04-23
- Event end date:
- 2026-04-27
- Language:
- English
- Pubs id:
- 2433615
- Local pid:
- pubs:2433615
- Deposit date:
- 2026-06-15
- ARK identifier:
Terms of use
- Copyright holder:
- Xue et al.
- Copyright date:
- 2026
- Rights statement:
- © The Authors 2026.
- Notes:
- The author accepted manuscript (AAM) of this paper has been made available under the University of Oxford's Open Access Publications Policy, and a CC BY public copyright licence has been applied.
- Licence:
- CC Attribution (CC BY)