Geo4D: leveraging video generators for geometric 4D scene reconstruction

Jiang, Z; Zheng, C; Laina, I; Larlus, D; Vedaldi, A

AI Collection

Conference item

Abstract:: We introduce Geo4D, a method to repurpose video diffusion models for monocular 3D reconstruction of dynamic scenes. By leveraging the strong dynamic priors captured by large-scale pre-trained video models, Geo4D can be trained using only synthetic data while generalizing well to real data in a zero-shot manner. Geo4D predicts several complementary geometric modalities, namely point, disparity, and ray maps. We propose a new multi-modal alignment algorithm to align and fuse these modalities, as well as a sliding window approach at inference time, thus enabling robust and accurate 4D reconstruction of long videos. Extensive experiments across multiple benchmarks show that Geo4D significantly surpasses state-of-the-art video depth estimation methods.

Publication status:: Accepted

Peer review status:: Peer reviewed

Cite this record

APA Style

Jiang, Z., Zheng, C., Laina, I., Larlus, D., & Vedaldi, A. (2025). Geo4D: leveraging video generators for geometric 4D scene reconstruction. International Conference on Computer Vision (ICCV 2025).

MLA Style

Jiang, Z, et al. “Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction.” International Conference on Computer Vision (ICCV 2025), 2025.

Chicago Style

Jiang, Z, C Zheng, I Laina, D Larlus, and A Vedaldi. 2025. “Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction.” In International Conference on Computer Vision (ICCV 2025). IEEE.

Access Document

Files:: Jiang_et_al_2025_Geo4D_leveraging_video.pdf

(Accepted manuscript, PDF, 18.4 MB)

Authors

Jiang, Z

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Research group:: Visual Geometry Group
Role:: Author

Zheng, C

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Research group:: Visual Geometry Group
Role:: Author
ORCID:: 0000-0002-3584-9640

Laina, I

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Research group:: Visual Geometry Group
Role:: Author

Larlus, D

Role:: Author

Vedaldi, A

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Research group:: Visual Geometry Group
Oxford college:: New College
Role:: Author
ORCID:: 0000-0003-1374-2858

Publisher:: IEEE
Acceptance date:: 2025-07-23
Event title:: International Conference on Computer Vision (ICCV 2025)
Event location:: Honolulu, Hawai'i, USA
Event website:: https://iccv.thecvf.com/
Event start date:: 2025-10-19
Event end date:: 2025-10-23

Language:: English
Pubs id:: 2300211
Local pid:: pubs:2300211
Deposit date:: 2025-10-17
ARK identifier:: ark:/29072/ora_02ad21d3cecd41a5b350f841b86f6034

Terms of use

Notes:: This paper will be presented at the International Conference on Computer Vision (ICCV 2025), 19th-23rd October 2025, Honolulu, Hawai'i, USA.
The author accepted manuscript (AAM) of this paper has been made available under the University of Oxford's Open Access Publications Policy, and a CC BY public copyright licence has been applied.

Licence:: CC Attribution (CC BY)
