Conference item
Paper2Poster: towards multimodal poster automation from scientific papers
- Abstract:
- Academic poster generation is a crucial yet challenging task in scientific communication, requiring the compression of long-context interleaved documents into a single, visually coherent page. To address this challenge, we introduce the first benchmark and metric suite for poster generation, which pairs recent conference papers with author-designed posters and evaluates outputs on (i) Visual Quality: semantic alignment with human posters, (ii) Textual Coherence: language fluency, (iii) Holistic Assessment: six fine-grained aesthetic and informational criteria scored by a VLM-as-judge, and notably (iv) PaperQuiz: the poster's ability to convey core paper content as measured by VLMs answering generated quizzes. Building on this benchmark, we propose PosterAgent, a top-down, visual-in-the-loop multi-agent pipeline: the (a) Parser distills the paper into a structured asset library; the (b) Planner aligns text–visual pairs into a binary-tree layout that preserves reading order and spatial balance; and the (c) Painter–Commenter loop refines each panel by executing rendering code and using VLM feedback to eliminate overflow and ensure alignment. In our comprehensive evaluation, we find that GPT-4o outputs, though visually appealing at first glance, often exhibit noisy text and poor PaperQuiz scores, and that reader engagement is the primary aesthetic bottleneck, as human-designed posters rely largely on visual semantics to convey meaning. Our fully open-source variants (e.g., based on the Qwen-2.5 series) outperform existing GPT-4o-driven multi-agent systems across nearly all metrics, while using 87% fewer tokens. The pipeline transforms a 22-page paper into a finalized yet editable '.pptx' poster for just USD 0.005. These findings chart clear directions for the next generation of fully automated poster-generation models. The code and datasets are available at https://github.com/Paper2Poster/Paper2Poster.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
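The abstract describes the Planner as arranging panels in a binary-tree layout that preserves reading order and spatial balance. The following is a minimal sketch of that general idea only, not the authors' implementation: the names `Box` and `layout`, the use of a per-panel weight (for example, an estimate of content length), and the alternating split direction are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle on the poster canvas (hypothetical helper)."""
    x: float
    y: float
    w: float
    h: float


def layout(panels, box, vertical=True):
    """Recursively split `box` among `panels`.

    `panels` is a list of (name, weight) pairs in reading order; weights are
    assumed to be positive. Each recursion step is one internal node of a
    binary tree: the panel list is cut where the two halves have the most
    similar total weight, and the box is divided in proportion to them.
    """
    if len(panels) == 1:
        return {panels[0][0]: box}

    total = sum(weight for _, weight in panels)

    # Find the cut index that best balances the two halves.
    running, best_i, best_gap = 0.0, 1, float("inf")
    for i in range(1, len(panels)):
        running += panels[i - 1][1]
        gap = abs(total - 2 * running)
        if gap < best_gap:
            best_i, best_gap = i, gap

    first, second = panels[:best_i], panels[best_i:]
    frac = sum(weight for _, weight in first) / total

    if vertical:
        # Side-by-side columns: earlier panels go on the left.
        a = Box(box.x, box.y, box.w * frac, box.h)
        b = Box(box.x + box.w * frac, box.y, box.w * (1 - frac), box.h)
    else:
        # Stacked rows: earlier panels go on top.
        a = Box(box.x, box.y, box.w, box.h * frac)
        b = Box(box.x, box.y + box.h * frac, box.w, box.h * (1 - frac))

    # Alternate the split direction at each level of the tree.
    result = layout(first, a, not vertical)
    result.update(layout(second, b, not vertical))
    return result


# Illustrative input: panel names and weights are made up for this sketch.
example_panels = [("Intro", 1), ("Method", 2), ("Results", 2), ("Conclusion", 1)]
example_boxes = layout(example_panels, Box(0, 0, 48, 36))
```

Because the panel list is only ever cut, never reordered, a left-to-right, top-to-bottom traversal of the resulting boxes follows the original reading order, and each panel receives canvas area in proportion to its weight.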
Access Document
- Files:
- Version of record, pdf, 31.3MB
- Publication website:
- https://openreview.net/forum?id=p0E74lpRBD
- Funder:
- Engineering and Physical Sciences Research Council
- Funder identifier:
- https://ror.org/0439y7842
- Grant:
- EP/W002981/1
- Publisher:
- OpenReview
- Host title:
- 39th Conference on Neural Information Processing Systems (NeurIPS 2025) Position Paper Track
- Publication date:
- 2025-12-03
- Acceptance date:
- 2025-09-18
- Event title:
- 39th Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
- Event location:
- San Diego, CA, USA
- Event website:
- https://neurips.cc/Conferences/2025
- Event start date:
- 2025-12-02
- Event end date:
- 2025-12-07
- Language:
- English
- Pubs id:
- 2364167
- Local pid:
- pubs:2364167
- Deposit date:
- 2026-01-27
- ARK identifier:
Terms of use
- Copyright date:
- 2025
- Rights statement:
- This paper has been made open access via Creative Commons licensing (https://creativecommons.org/licenses/by/4.0/).
- Notes:
- This paper was presented at the 39th Annual Conference on Neural Information Processing Systems (NeurIPS 2025), 2nd-7th December 2025, San Diego, CA, USA.