Journal article

Text-guided camouflaged object detection

Abstract:
Camouflaged object detection (COD) aims to identify and segment objects hidden in the background due to their high similarity in colour or texture. Recent efforts have investigated the utilization of foundation models to provide extra supervision for solving this task, such as object masks predicted by Segment Anything Models and labelled camouflaged object images generated by Stable Diffusion Models. In this work, instead of visual supervision, we endeavour to utilize Multimodal Large Language Models (MLLM) to provide textual information for COD. Specifically, we propose the Text-Guided Camouflaged Object Detection (TG-COD) framework, which consists of two main stages: extracting pseudo-textual description from MLLM and integrating textual information in the COD process. Our framework leverages the knowledge of MLLM and guides the detection process with textual information. Comprehensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance and also exhibits strong generalization capability in a few-shot setting.
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1016/j.patcog.2025.112058

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
ORCID:
0009-0006-0259-5732

Institution:
University of Oxford
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author


Publisher:
Elsevier
Journal:
Pattern Recognition
Volume:
170
Article number:
112058
Publication date:
2025-07-07
Acceptance date:
2025-06-23
DOI:
10.1016/j.patcog.2025.112058
EISSN:
1873-5142
ISSN:
0031-3203


Language:
English
Pubs id:
2242784
Local pid:
pubs:2242784
Deposit date:
2025-08-07
