Conference item
On pretraining data diversity for self-supervised learning
- Abstract:
- We explore the impact of training with more diverse datasets, characterized by the number of unique samples, on the performance of self-supervised learning (SSL) under a fixed computational budget. Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal. Notably, even with an exceptionally large pretraining data diversity achieved through methods like web crawling or diffusion-generated data, among other ways, the distribution shift remains a challenge. Our experiments are comprehensive with seven SSL methods using large-scale datasets such as ImageNet and YFCC100M amounting to over 200 GPU days. The code and trained models will be available at https://github.com/hammoudhasan/DiversitySSL.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 2.7MB)
- Publisher copy:
- 10.1007/978-3-031-72992-8_4
Authors
- Publisher:
- Springer
- Host title:
- Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LVI
- Pages:
- 54–71
- Series:
- Lecture Notes in Computer Science
- Series number:
- 15114
- Publication date:
- 2024-10-30
- Acceptance date:
- 2024-02-26
- Event title:
- 18th European Conference on Computer Vision (ECCV 2024)
- Event location:
- Milan, Italy
- Event website:
- https://eccv2024.ecva.net/
- Event start date:
- 2024-09-29
- Event end date:
- 2024-10-04
- DOI:
- EISSN:
- 1611-3349
- ISSN:
- 0302-9743
- EISBN:
- 978-3-031-72992-8
- ISBN:
- 978-3-031-72991-1
- Language:
- English
- Pubs id:
- 2013485
- Local pid:
- pubs:2013485
- Deposit date:
- 2024-07-10
Terms of use
- Copyright holder:
- Hammoud et al.
- Copyright date:
- 2024
- Rights statement:
- © 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
- Notes:
- This paper was presented at the 18th European Conference on Computer Vision (ECCV 2024), 29th September-4th October 2024, Milan, Italy. This is the accepted manuscript version of the article. The final version is available online from Springer at https://dx.doi.org/10.1007/978-3-031-72992-8_4