Conference item

CoMAS: co-evolving multi-agent systems via interaction rewards

Abstract:

Self-evolution is a central research topic in enabling large language model (LLM)-based agents to continually improve their capabilities after pretraining. Recent research has witnessed a transition from reinforcement learning (RL)-free to RL-based methods. Current RL-based methods either rely on dense external reward signals or extract intrinsic reward signals from LLMs themselves. However, these approaches diverge from the self-evolution mechanisms observed in human intelligence, where individuals learn and improve through mutual discussion and collaboration. In this work, we introduce Co-Evolving Multi-Agent Systems (CoMAS), a novel framework that enables agents to improve autonomously by learning from inter-agent interactions without external supervision. CoMAS generates intrinsic rewards from rich discussion dynamics, employs an LLM-as-a-judge mechanism to formulate these rewards, and optimizes each agent’s policy through RL, thereby enabling decentralized and scalable co-evolution. Experimental results demonstrate that CoMAS consistently outperforms untrained agents and achieves state-of-the-art performance across most evaluation settings. Ablation studies confirm the necessity of interaction-based reward signals and reveal promising scalability as the number and diversity of agents increase. These findings establish CoMAS as a novel and effective paradigm for self-evolution in LLM-based agents. Our code is available at: https://github.com/xxyQwQ/CoMAS.

Publication status:
Accepted
Peer review status:
Peer reviewed

Publication website:
https://openreview.net/forum?id=ihwAzktmWc

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author


Publisher:
OpenReview
Host title:
Proceedings of the 14th International Conference on Learning Representations (ICLR 2026)
Article number:
5758
Acceptance date:
2026-01-26
Event title:
14th International Conference on Learning Representations (ICLR 2026)
Event location:
Rio de Janeiro, Brazil
Event website:
https://openreview.net/pdf?id=ihwAzktmWc
Event start date:
2026-04-23
Event end date:
2026-04-27


Language:
English
Pubs id:
2433615
Local pid:
pubs:2433615
Deposit date:
2026-06-15