Conference item
DSO: aligning 3D generators with simulation feedback for physical soundness
- Abstract:
- Most 3D object generators prioritize aesthetic quality, often neglecting the physical constraints necessary for practical applications. One such constraint is that a 3D object should be self-supporting, i.e., remain balanced under gravity. Previous approaches to generating stable 3D objects relied on differentiable physics simulators to optimize geometry at test time, which is slow, unstable, and prone to local optima. Inspired by the literature on aligning generative models with external feedback, we propose Direct Simulation Optimization (DSO). This framework leverages feedback from a (non-differentiable) simulator to increase the likelihood that the 3D generator directly outputs stable 3D objects. We construct a dataset of 3D objects labeled with stability scores obtained from the physics simulator. This dataset enables fine-tuning of the 3D generator using the stability score as an alignment metric, via direct preference optimization (DPO) or direct reward optimization (DRO) - a novel objective we introduce to align diffusion models without requiring pairwise preferences. Our experiments demonstrate that the fine-tuned feed-forward generator, using either the DPO or DRO objective, is significantly faster and more likely to produce stable objects than test-time optimization. Notably, the DSO framework functions even without any ground-truth 3D objects for training, allowing the 3D generator to self-improve by automatically collecting simulation feedback on its own outputs.
- Publication status:
- Accepted
- Peer review status:
- Peer reviewed
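The abstract describes labelling generated objects with simulator stability scores and then fine-tuning with DPO, which needs pairwise preferences. As a rough illustration only, the sketch below turns scored samples into (preferred, rejected) pairs. The abstract does not say how pairs are formed, so the `Sample` type, the `build_preference_pairs` helper, the `margin` threshold, and the example scores are all hypothetical. The sketch also assumes all samples were generated from the same conditioning input and that a higher score means a more stable object.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """A generated object with a simulator-derived score (hypothetical schema)."""
    sample_id: str
    stability: float  # assumed convention: higher means more stable


def build_preference_pairs(
    samples: List[Sample], margin: float = 0.0
) -> List[Tuple[str, str]]:
    """Return (preferred_id, rejected_id) pairs ranked by stability score.

    Pairs whose scores differ by no more than `margin` are skipped, since
    they carry little preference signal. This is an illustrative heuristic,
    not a rule taken from the paper.
    """
    pairs = []
    for a, b in combinations(samples, 2):
        if abs(a.stability - b.stability) <= margin:
            continue
        winner, loser = (a, b) if a.stability > b.stability else (b, a)
        pairs.append((winner.sample_id, loser.sample_id))
    return pairs


# Made-up scores for three samples from the same (hypothetical) input.
samples = [
    Sample("chair_a", 0.9),
    Sample("chair_b", 0.2),
    Sample("chair_c", 0.85),
]
pairs = build_preference_pairs(samples, margin=0.1)
```

With these made-up scores, `chair_a` and `chair_c` are too close to form a pair, while each is preferred over `chair_b`. The DRO objective mentioned in the abstract is described as not requiring pairwise preferences, so under that objective the raw scores would presumably be used without this pairing step.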
Access Document
- Files:
- Accepted manuscript (pdf, 13.5MB)
- Publication website:
- https://iccv.thecvf.com/virtual/2025/poster/2592
Authors
- Publisher:
- IEEE
- Article number:
- 165
- Acceptance date:
- 2025-07-23
- Event title:
- International Conference on Computer Vision (ICCV 2025)
- Event location:
- Honolulu, Hawai'i, USA
- Event website:
- https://iccv.thecvf.com/
- Event start date:
- 2025-10-19
- Event end date:
- 2025-10-23
- Language:
- English
- Pubs id:
- 2300262
- Local pid:
- pubs:2300262
- Deposit date:
- 2025-10-17
- ARK identifier:
Terms of use
- Copyright holder:
- Li et al
- Copyright date:
- 2025
- Rights statement:
- ©2025 The Authors.
- Notes:
- This paper was presented at the International Conference on Computer Vision (ICCV 2025), 19th-23rd October 2025, Honolulu, Hawai'i, USA. The author accepted manuscript (AAM) of this paper has been made available under the University of Oxford's Open Access Publications Policy, and a CC BY public copyright licence has been applied.
- Licence:
- CC Attribution (CC BY)