VFMF: dense forecasting by generating foundation model features

Boduljak, G; Lan, Y; Rupprecht, C; Vedaldi, A

AI Collection

Conference item

Abstract:: Forecasting from partial observations is central to world modeling. Many recent methods represent the world through images, and reduce forecasting to stochastic video generation. Although such methods excel at realism and visual fidelity, predicting pixels is computationally intensive and not directly useful in many applications, as it requires translating RGB into signals useful for decision making. An alternative approach uses features from vision foundation models (VFMs) as world representations, performing deterministic regression to predict future world states. These features can be directly translated into actionable signals such as semantic segmentation and depth, while remaining computationally efficient. However, deterministic regression averages over multiple plausible futures, undermining forecast accuracy by failing to capture uncertainty. To address this crucial limitation, we introduce a generative forecaster that performs autoregressive flow matching in VFM feature space. Our key insight is that generative modeling in this space requires encoding VFM features into a compact latent space suitable for diffusion. We show that this latent space preserves information more effectively than previously used PCA-based alternatives, both for forecasting and other applications, such as image generation. Our latent predictions can be easily decoded into multiple useful and interpretable output modalities: semantic segmentation, depth, surface normals, and even RGB. With matched architecture and compute, our method produces sharper and more accurate predictions than regression across all modalities. Our results suggest that stochastic conditional generation of VFM features offers a promising and scalable foundation for future world models.
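
The abstract's point that deterministic regression averages over multiple plausible futures can be illustrated with a toy example. The sketch below is not the authors' method or code. It assumes a hypothetical one-dimensional "future" that is either -1 or +1 with equal probability, and every name in it (`futures`, `regression_prediction`, `generative_sample`, `mse`) is illustrative only.

```python
import random
import statistics

random.seed(0)

# Toy assumption: each possible future is -1.0 or +1.0 with equal probability.
futures = [random.choice([-1.0, 1.0]) for _ in range(10_000)]


def mse(pred, targets):
    """Mean squared error of a single constant prediction against all targets."""
    return statistics.fmean([(pred - t) ** 2 for t in targets])


# The constant prediction that minimises mean squared error is the sample mean,
# which lies between the two modes and matches neither of them.
regression_prediction = statistics.fmean(futures)

# A generative forecaster instead samples one plausible future.
# Here that is emulated by drawing from the empirical distribution.
generative_sample = random.choice(futures)

print("regression prediction:", regression_prediction)
print("MSE of regression prediction:", mse(regression_prediction, futures))
print("one generative sample:", generative_sample)
```

In this toy setting the regression output is optimal under squared error yet sits near 0, a value that never occurs, whereas each sample is a valid future. This is the one-dimensional analogue of the blurring the abstract attributes to regression.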

Publication status:: Accepted

Peer review status:: Peer reviewed

Cite this record

APA Style

Boduljak, G., Lan, Y., Rupprecht, C., & Vedaldi, A. (2026). VFMF: dense forecasting by generating foundation model features. 43rd International Conference on Machine Learning (ICML 2026).

MLA Style

Boduljak, G, et al. “VFMF: Dense Forecasting by Generating Foundation Model Features.” 43rd International Conference on Machine Learning (ICML 2026), 2026.

Chicago Style

Boduljak, G, Y Lan, C Rupprecht, and A Vedaldi. 2026. “VFMF: Dense Forecasting by Generating Foundation Model Features.” In 43rd International Conference on Machine Learning (ICML 2026). International Conference on Machine Learning.

Authors

Boduljak, G

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Lan, Y

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Rupprecht, C

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Role:: Author

Vedaldi, A

Institution:: University of Oxford
Division:: MPLS
Department:: Engineering Science
Oxford college:: New College
Role:: Author
ORCID:: 0000-0003-1374-2858

Publisher:: International Conference on Machine Learning
Acceptance date:: 2026-01-26
Event title:: 43rd International Conference on Machine Learning (ICML 2026)
Event location:: Seoul, South Korea
Event website:: https://icml.cc/Conferences/2026
Event start date:: 2026-07-06
Event end date:: 2026-07-11

Language:: English
Pubs id:: 2434272
Local pid:: pubs:2434272
Deposit date:: 2026-06-17
ARK identifier:: ark:/29072/ora_fdc9b2da1b6b4d56a701631fe1a78207

Terms of use

Notes:: This conference paper has been accepted for presentation at the 43rd International Conference on Machine Learning, Seoul, South Korea, July 6th - 11th, 2026.

Licence:: Terms and Conditions of Use for Oxford University Research Archive
