Journal article

ActiveEye: enabling continuous and responsive video understanding for smart eyewear systems

Abstract:
Integrating vision-language models (VLMs) with wearable devices offers great potential for continuous and responsive video understanding, a key capability for applications such as smart eyewear-based conversational assistants. However, achieving this on resource-constrained devices is challenging due to the high energy demands of continuous spatial-temporal sampling and transmission. We propose ActiveEye, a VLM designed for energy-efficient and responsive video understanding. ActiveEye separates visual and motion semantic representations and incorporates an active perception-based feedback path to adaptively adjust spatial-temporal sampling and transmission rates. Implemented as a wearable-mobile-cloud system, ActiveEye is evaluated for energy efficiency, real-time semantic change detection, and video understanding in both laboratory and field studies. Using the EgoSchema dataset, ActiveEye reduces the front-end energy consumption by 49.14%, supporting 8.37 hours of continuous operation on a 2.1 Wh battery. It achieves the highest F1 score (0.80) and the lowest average time difference (1.30 s) compared with heuristic-based event detection algorithms, validating its timely semantic detection. Furthermore, ActiveEye achieves a visual question answering (VQA) accuracy of 61.6%, which is comparable to state-of-the-art VLM agents, despite their reliance on larger language decoders and more computationally intensive frame selection strategies. Two rounds of in-field user evaluations further confirm its effectiveness in real-world settings, demonstrating its practical viability as a continuous and responsive video understanding system, conversational assistant, and wearable companion.
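The energy figures in the abstract can be related with a back-of-envelope calculation. The sketch below is illustrative only and is not taken from the article: it assumes the full 2.1 Wh is usable, that the front end is the only load on that battery, and that the 49.14% reduction applies to the same average power draw. The function names are hypothetical.

```python
def average_power_w(battery_wh: float, runtime_h: float) -> float:
    """Average power (W) implied by draining battery_wh over runtime_h hours."""
    if runtime_h <= 0:
        raise ValueError("runtime_h must be positive")
    return battery_wh / runtime_h


def baseline_power_w(reduced_power_w: float, reduction_fraction: float) -> float:
    """Power (W) before a fractional reduction, given the reduced power."""
    if not 0 <= reduction_fraction < 1:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return reduced_power_w / (1.0 - reduction_fraction)


# Figures quoted in the abstract.
BATTERY_WH = 2.1
RUNTIME_H = 8.37
REDUCTION = 0.4914

# Assumption: the whole battery capacity is usable and feeds only the front end.
implied_power = average_power_w(BATTERY_WH, RUNTIME_H)

# Assumption: the 49.14% saving applies to this same average power draw.
implied_baseline = baseline_power_w(implied_power, REDUCTION)
implied_baseline_runtime = BATTERY_WH / implied_baseline

print(f"implied average power:    {implied_power:.3f} W")
print(f"implied baseline power:   {implied_baseline:.3f} W")
print(f"implied baseline runtime: {implied_baseline_runtime:.2f} h")
```

Under these assumptions the quoted runtime corresponds to an average draw of roughly 0.25 W, against roughly 0.49 W without the reduction. The article itself may define the baseline and the measured load differently.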
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1145/3770641

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
ORCID:
0000-0002-6220-029X


Publisher:
Association for Computing Machinery
Journal:
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume:
9
Issue:
4
Pages:
1-33
Article number:
228
Publication date:
2025-12-02
Acceptance date:
2025-09-19
DOI:
EISSN:
2474-9567


Language:
English
Keywords:
Pubs id:
2348944
Local pid:
pubs:2348944
Deposit date:
2026-03-09
ARK identifier:
