Conference item
Semantic-aware auto-encoders for self-supervised representation learning
- Abstract:
- The resurgence of unsupervised learning can be attributed to the remarkable progress of self-supervised learning, which includes generative $(\mathcal{G})$ and discriminative $(\mathcal{D})$ models. In computer vision, the mainstream self-supervised learning algorithms are $\mathcal{D}$ models. However, designing a $\mathcal{D}$ model can be over-complicated; moreover, some studies have hinted that a $\mathcal{D}$ model may not be as general and interpretable as a $\mathcal{G}$ model. In this paper, we switch from $\mathcal{D}$ models to $\mathcal{G}$ models using the classical auto-encoder (AE). Note that a vanilla $\mathcal{G}$ model is far less efficient than a $\mathcal{D}$ model in self-supervised computer vision tasks, as it wastes model capacity on overfitting semantic-agnostic high-frequency details. Inspired by perceptual learning, which can use cross-view learning to perceive concepts and semantics (following [26], we refer to semantics as visual concepts; e.g., a semantic-aware model can perceive visual concepts, and the learned features are efficient in object recognition, detection, etc.), we propose a novel AE that learns semantic-aware representations via cross-view image reconstruction. We use one view of an image as the input and another view of the same image as the reconstruction target. This kind of AE has rarely been studied before, and its optimization is very difficult. To enhance learning ability and find a feasible solution, we propose a semantic aligner that uses geometric-transformation knowledge to align the hidden code of the AE and ease optimization. These techniques significantly improve the representation learning ability of the AE and make self-supervised learning with $\mathcal{G}$ models possible. Extensive experiments on many large-scale benchmarks (e.g., ImageNet, COCO 2017, and SYSU-30k) demonstrate the effectiveness of our methods. Code is available at https://github.com/wanggrun/Semantic-Aware-AE.
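The cross-view setup described in the abstract (one view of an image as the AE input, a second view of the same image as the reconstruction target) can be sketched in a few lines. Below is a minimal NumPy illustration of the data side of that idea only; the specific augmentations (flip, crop-resize) and the pixel-wise MSE loss are hypothetical simplifications for illustration, not the paper's actual pipeline (see the linked repository for the real implementation).

```python
import numpy as np

def make_views(img, rng):
    """Create two views of the same image: the first serves as the AE input,
    the second as the reconstruction target (the cross-view setup).

    The augmentations here (random horizontal flip; random crop with a
    nearest-neighbour resize back to the original size) are hypothetical
    stand-ins for the paper's actual view generation.
    """
    h, w = img.shape
    # View A: random horizontal flip of the original image.
    view_a = img[:, ::-1] if rng.random() < 0.5 else img
    # View B: random crop covering 3/4 of each side, resized back to (h, w).
    top = rng.integers(0, h // 4)
    left = rng.integers(0, w // 4)
    crop = img[top: top + 3 * h // 4, left: left + 3 * w // 4]
    ys = np.arange(h) * crop.shape[0] // h  # nearest-neighbour row indices
    xs = np.arange(w) * crop.shape[1] // w  # nearest-neighbour column indices
    view_b = crop[np.ix_(ys, xs)]
    return view_a, view_b

def cross_view_loss(decoder_out, target):
    """Pixel-wise MSE between the decoded input view and the *other* view,
    so the AE cannot succeed by memorizing high-frequency detail of the input."""
    return float(np.mean((decoder_out - target) ** 2))
```

In this sketch, an encoder-decoder would map `view_a` to a reconstruction that is scored against `view_b`; because the two views differ geometrically, minimizing this loss pushes the hidden code toward view-invariant (semantic) content rather than per-pixel detail.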
- Publication status:
- Published
- Peer review status:
- Peer reviewed
- Files:
- Accepted manuscript (PDF, 1.3MB)
- Publisher copy:
- 10.1109/cvpr52688.2022.00944
Authors
- Publisher:
- IEEE
- Host title:
- 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- Pages:
- 9654-9665
- Publication date:
- 2022-09-27
- Acceptance date:
- 2022-03-02
- Event title:
- IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR 2022)
- Event location:
- New Orleans, Louisiana
- Event website:
- https://cvpr2022.thecvf.com/
- Event start date:
- 2022-06-21
- Event end date:
- 2022-06-24
- DOI:
- 10.1109/cvpr52688.2022.00944
- EISSN:
- 2575-7075
- ISSN:
- 1063-6919
- EISBN:
- 9781665469463
- ISBN:
- 9781665469470
- Language:
- English
- Keywords:
- Pubs id:
- 1304012
- Local pid:
- pubs:1304012
- Deposit date:
- 2022-11-14
Terms of use
- Copyright holder:
- IEEE
- Copyright date:
- 2022
- Rights statement:
- © 2022 IEEE.
- Notes:
- This is the accepted manuscript version of the paper. The final version is available online from IEEE at: https://doi.org/10.1109/CVPR52688.2022.00944