Thesis
Self-supervised learning of structural representations of visual objects
- Abstract:
- This thesis explores how a computer can learn the structure of visual objects in the absence of strong supervision using self-supervised learning. We demonstrate that we can learn structural representations of objects using an autoencoding framework with reconstruction as the key learning signal. We do this by engineering bottlenecks that disentangle object structure from other factors of variation. Moreover, we design the bottlenecks to represent the object structure in the form of 2D and 3D object landmarks or a 3D mesh. Specifically, we develop a method that automatically discovers 2D object landmarks without any annotations using a conditional autoencoder with a 2D keypoint bottleneck that disentangles pose, represented as 2D keypoints, and appearance. Although self-supervised learning methods can learn stable object landmarks, the automatically discovered landmarks are not aligned with the landmarks that human annotators would label. To address this, we present a method that can inject an unpaired empirical prior into a conditional autoencoder by introducing a novel landmark autoencoding that can leverage powerful image discriminators used in adversarial learning. A by-product of these conditional autoencoding methods is that the generation can be interactively controlled by manipulating the keypoints in the bottleneck. We leverage this feature in a novel method for interactive 3D shape deformation. The method is trained in a self-supervised way to use automatically discovered 3D landmarks to align pairs of 3D shapes. At test time, the method allows the user to interactively deform the object shape via the discovered 3D object landmarks. Finally, we present a method that uses a photo-geometric autoencoder to recover the 3D shape of an object category without any 3D annotations. It uses videos for training and learns to disentangle an image input into a rigid pose, texture and deformable shape model.
+ Clarendon Fund
- Funder identifier:
- http://dx.doi.org/10.13039/501100014748
- Funding agency for:
- Jakab, T
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Pubs id:
- 2043072
- Local pid:
- pubs:2043072
- Deposit date:
- 2022-07-08
Terms of use
- Copyright holder:
- Tomas Jakab
- Copyright date:
- 2021