Conference item: Poster

An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem

Abstract:
Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute. We compare our detailed calculations to direct simulations of a two-layer neural network trained on multitask sparse parity, where the tasks in the dataset are distributed according to a power law. Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size, or model size increases in the neural network.
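
For readers unfamiliar with the benchmark named in the abstract, the following is a minimal sketch of how a multitask sparse parity dataset with power-law task frequencies might be generated. It assumes the commonly used construction: one-hot control bits select the active task, random task bits follow, and the label is the parity of a fixed, task-specific subset of the task bits. The function name, parameter names, default values, and the exact form of the power law (probability of task i proportional to i^(-exponent)) are illustrative assumptions and are not taken from this record.

```python
import numpy as np


def make_multitask_sparse_parity(n_samples, n_tasks=8, n_bits=16, k=3,
                                 exponent=2.0, seed=0):
    """Hypothetical generator for a multitask sparse parity dataset.

    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Each task owns a fixed subset of k distinct positions among the task bits.
    subsets = np.stack([
        rng.choice(n_bits, size=k, replace=False) for _ in range(n_tasks)
    ])

    # Assumed power-law task frequencies: p_i proportional to i^(-exponent).
    ranks = np.arange(1, n_tasks + 1, dtype=float)
    probs = ranks ** (-exponent)
    probs /= probs.sum()

    # Sample which task each example belongs to.
    tasks = rng.choice(n_tasks, size=n_samples, p=probs)

    # One-hot control bits identify the active task.
    control = np.eye(n_tasks, dtype=np.int64)[tasks]

    # Uniformly random task bits.
    bits = rng.integers(0, 2, size=(n_samples, n_bits))

    # Label: parity of the bits in the active task's subset.
    rows = np.arange(n_samples)[:, None]
    labels = bits[rows, subsets[tasks]].sum(axis=1) % 2

    X = np.concatenate([control, bits], axis=1)
    return X, labels, tasks, subsets


X, y, tasks, subsets = make_multitask_sparse_parity(1000)
```

Under these assumptions, low-index tasks appear far more often than high-index ones, which is the setting in which different skills would be expected to be learned at different points as training time, data size, or model size grows.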
Publication status:
Published
Peer review status:
Peer reviewed

Authors


More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Physics
Sub department:
Theoretical Physics
Role:
Author
ORCID:
0000-0001-9314-8976
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Physics
Sub department:
Theoretical Physics
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Physics
Sub department:
Theoretical Physics
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Physics
Sub department:
Theoretical Physics
Oxford college:
Worcester College
Role:
Author
ORCID:
0000-0002-8438-910X


Publisher:
Curran Associates
Host title:
Advances in Neural Information Processing Systems 37 (NeurIPS 2024)
Volume:
37
Publication date:
2024-09-25
Acceptance date:
2024-11-06
Event title:
38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
Event location:
Vancouver, BC, Canada
Event website:
http://neurips.cc/Conferences/2024
Event start date:
2024-12-10
Event end date:
2024-12-15
ISSN:
1049-5258


Language:
English
Subtype:
Poster
Pubs id:
2101667
Local pid:
pubs:2101667
Deposit date:
2025-06-09
