Journal article

Select2Plan: training-free ICL-based planning through VQA and memory retrieval

Abstract:
We introduce Select2Plan (S2P), a novel training-free framework for high-level robot planning that leverages off-the-shelf VLMs for autonomous navigation. Unlike most learning-based approaches, which require extensive task-specific training and large-scale data collection, S2P removes the need for fine-tuning by adapting inputs to align with the VLM's pretraining data. Our method achieves this through a combination of structured Visual Question Answering (VQA), which grounds action selection on the image, and In-Context Learning (ICL), which exploits knowledge drawn from relevant examples in a memory bank of (visually) annotated data that can include diverse, in-the-wild sources. We demonstrate S2P's flexibility by evaluating it in both First-Person View (FPV) and Third-Person View (TPV) navigation. S2P improves the performance of a baseline VLM by 40% in TPV and surpasses end-to-end trained models by approximately 24% in FPV when tasked with navigating towards unseen objects in novel scenes. These results highlight the adaptability, simplicity, and effectiveness of our training-free approach, demonstrating that the use of pre-trained VLMs with structured memory retrieval enables robust high-level robot planning without costly task-specific training. Our experiments also show that retrieving samples from heterogeneous data sources, including online videos of different robots or humans walking, is highly beneficial for navigation. Notably, our method generalizes effectively to novel scenarios, requiring only a handful of demonstrations.
Project Page: lambdavi.github.io/select2plan
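The abstract describes two ingredients: retrieving relevant annotated examples from a memory bank, and posing action selection as a structured question over labelled candidates. The sketch below illustrates that general pattern only and is not the authors' implementation. Everything in it is an assumption made for illustration: the `MemoryEntry` fields, the use of cosine similarity over precomputed embeddings as the retrieval metric, the value of `k`, and the prompt wording. The abstract does not specify any of these, and the VLM call itself is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    """Hypothetical memory-bank record (field names are assumptions)."""
    embedding: list[float]  # precomputed embedding of the annotated frame
    description: str        # short text describing the annotated frame
    chosen_label: int       # candidate label selected in the demonstration


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: list[float], memory: list[MemoryEntry], k: int = 2) -> list[MemoryEntry]:
    """Return the k entries most similar to the query embedding (assumed metric)."""
    ranked = sorted(memory, key=lambda e: cosine(query, e.embedding), reverse=True)
    return ranked[:k]


def build_prompt(goal: str, candidate_labels: list[int], examples: list[MemoryEntry]) -> str:
    """Assemble an in-context prompt: retrieved examples first, then the question."""
    lines = []
    for i, ex in enumerate(examples, start=1):
        lines.append(f"Example {i}: {ex.description} -> chosen candidate: {ex.chosen_label}")
    lines.append(f"Goal: {goal}")
    lines.append("Candidates marked on the image: " + ", ".join(str(c) for c in candidate_labels))
    lines.append("Question: which candidate should the robot move to next? Answer with one label.")
    return "\n".join(lines)


# Toy memory bank with made-up embeddings and descriptions.
memory = [
    MemoryEntry([1.0, 0.0], "corridor, target visible on the left", 2),
    MemoryEntry([0.0, 1.0], "open room, target behind a table", 4),
    MemoryEntry([0.9, 0.1], "corridor, target at the far end", 1),
]
examples = retrieve([1.0, 0.05], memory, k=2)
prompt = build_prompt("reach the chair", [1, 2, 3], examples)
print(prompt)
```

In a real pipeline the resulting text would be sent to a VLM together with the annotated image; keeping retrieval and prompt assembly as plain functions makes it easy to swap in a different similarity measure or memory source.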
Publication status:
Published
Peer review status:
Peer reviewed

Access Document

Publisher copy:
10.1109/LRA.2025.3606790

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
ORCID:
0009-0006-0259-5732
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author


Publisher:
IEEE
Journal:
IEEE Robotics and Automation Letters
Volume:
10
Issue:
11
Pages:
11267-11274
Publication date:
2025-09-04
Acceptance date:
2025-08-13
DOI:
10.1109/LRA.2025.3606790
EISSN:
2377-3766


Language:
English
Pubs id:
2288816
UUID:
uuid_37ba117d-385f-4893-b7a4-a57cfbf11432
Local pid:
pubs:2288816
Deposit date:
2025-11-21