Conference item
MALT: improving reasoning with multi-agent LLM training
- Abstract:
Large Language Models (LLMs) often produce answers with a single chain-of-thought, which restricts their ability to explore reasoning paths or self-correct flawed outputs in complex tasks. In this paper, we introduce MALT (Multi-Agent LLM Training), a novel post-training strategy that divides the reasoning process into generation, verification, and refinement steps using a sequential pipeline of heterogeneous agents. During data generation, each agent is repeatedly sampled to form a multi-agent search tree, where final outputs are graded against ground-truth data. We then apply value iteration to propagate reward signals back to each role-conditioned model, automatically producing multi-agent post-training data without human or teacher-model supervision. Our off-policy approach allows each agent to specialize by learning from correct and incorrect trajectories, ultimately improving the end-to-end reasoning chain. On MATH, GSM8K, and CSQA, MALT surpasses the same baseline LLM with a relative improvement of 15.66%, 7.42%, and 9.40% respectively, making it an important advance towards multi-agent cooperative training.
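The data-generation step described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the samplers below are hypothetical deterministic stubs standing in for LLM calls, the branching factor `n` and the 0.5 labelling threshold are assumptions, and plain averaging of leaf rewards up the tree is used as a simple stand-in for the value iteration step named in the abstract.

```python
from dataclasses import dataclass, field
from statistics import mean

ROLES = ("generator", "verifier", "refiner")  # sequential pipeline from the abstract


@dataclass
class Node:
    role: str
    context: str   # input seen by this agent
    output: str    # output sampled from this agent
    children: list = field(default_factory=list)
    value: float = 0.0


def expand(context, depth, samplers, n, truth):
    """Sample n outputs for the role at `depth`, then recurse down the pipeline."""
    role = ROLES[depth]
    nodes = []
    for i in range(n):
        out = samplers[role](context, i)
        node = Node(role, context, out)
        if depth == len(ROLES) - 1:
            # Leaf: grade the final output against the ground truth.
            node.value = 1.0 if out == truth else 0.0
        else:
            node.children = expand(context + "\n" + out, depth + 1, samplers, n, truth)
            # Assumption: a node's value is the mean value of its children.
            node.value = mean(c.value for c in node.children)
        nodes.append(node)
    return nodes


def collect(nodes, threshold=0.5, data=None):
    """Turn the tree into per-role (input, output, label) training examples."""
    data = {r: [] for r in ROLES} if data is None else data
    for node in nodes:
        # Assumption: values above the threshold count as "correct".
        data[node.role].append((node.context, node.output, node.value > threshold))
        collect(node.children, threshold, data)
    return data


# Hypothetical stub agents; a real system would sample from role-conditioned LLMs.
def toy_generator(context, i):
    return ["answer: 4", "answer: 5"][i % 2]


def toy_verifier(context, i):
    return ["critique: looks right", "critique: recheck the sum"][i % 2]


def toy_refiner(context, i):
    # Keeps the generator's answer unless the critique asks for a recheck.
    return "4" if "recheck" in context or "answer: 4" in context else "5"


samplers = {"generator": toy_generator, "verifier": toy_verifier, "refiner": toy_refiner}
tree = expand("What is 2 + 2?", 0, samplers, n=2, truth="4")
data = collect(tree)

for node in tree:
    print(node.output, node.value)
for role in ROLES:
    print(role, len(data[role]), "examples")
```

In this toy run the wrong generation still reaches a correct final answer whenever the verifier asks for a recheck, so each role ends up with both positively and negatively labelled examples, which is the kind of signal the abstract says each agent learns from.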
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Version of record (pdf, 511.1KB)
- Publication website:
- https://openreview.net/forum?id=jXP9bgFack#discussion
Authors
- Publisher:
- OpenReview
- Article number:
- 466
- Publication date:
- 2025-07-08
- Acceptance date:
- 2025-07-07
- Event title:
- 2nd Conference on Language Modeling (COLM 2025)
- Event location:
- Montreal, Canada
- Event website:
- https://colmweb.org/index.html
- Event start date:
- 2025-10-07
- Event end date:
- 2025-10-10
- Language:
- English
- Pubs id:
- 2287121
- Local pid:
- pubs:2287121
- Deposit date:
- 2025-09-09
Terms of use
- Copyright holder:
- Motwani et al.
- Copyright date:
- 2025
- Rights statement:
- Copyright © 2025 The Author(s). This is an open access article published under CC BY 4.0.
- Licence:
- CC Attribution (CC BY)