Journal article
Capturing the geometry of object categories from video supervision
- Abstract:
- In this article, we are interested in capturing the 3D geometry of object categories simply by looking around them. Our unsupervised method fundamentally departs from traditional approaches that require either CAD models or manual supervision. It only uses video sequences capturing a handful of instances of an object category to train a deep architecture tailored for extracting 3D geometry predictions. Our deep architecture has three components. First, a Siamese viewpoint factorization network robustly aligns the input videos and, as a consequence, learns to predict the absolute category-specific viewpoint from a single image depicting any previously unseen instance of that category. Second, a depth estimation network performs monocular depth prediction. Finally, a 3D shape completion network predicts the full shape of the depicted object instance by re-using the output of the monocular depth prediction module. We also propose a way to configure networks so they can perform probabilistic predictions. We demonstrate that, properly used in our framework, this self-assessment mechanism is crucial for obtaining high quality predictions. Our network achieves state-of-the-art results on viewpoint prediction, depth estimation, and 3D point cloud estimation on public benchmarks.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 1.2MB)
- Publisher copy:
- 10.1109/tpami.2018.2871117
Authors
- Publisher:
- Institute of Electrical and Electronics Engineers
- Journal:
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- Volume:
- 42
- Issue:
- 2
- Pages:
- 261 - 275
- Publication date:
- 2018-06-14
- Acceptance date:
- 2016-12-19
- DOI:
- 10.1109/tpami.2018.2871117
- EISSN:
- 1939-3539
- ISSN:
- 0162-8828
- Pmid:
- 30235118
- Language:
- English
- Pubs id:
- pubs:920896
- UUID:
- uuid:51fa438e-ed36-4043-a6f4-a30609c9e428
- Local pid:
- pubs:920896
- Source identifiers:
- 920896
- Deposit date:
- 2018-10-23
Terms of use
- Copyright holder:
- IEEE
- Copyright date:
- 2018
- Notes:
- © 2018 IEEE. This is the accepted manuscript version of the article. The final version is available online from IEEE at: https://doi.org/10.1109/TPAMI.2018.2871117