Over-parameterised shallow neural networks with asymmetrical node scaling: global convergence guarantees and feature learning

Caron, F; Ayed, F; Jung, P; Lee, H; Lee, J; Yang, H

AI Collection

Journal article

Abstract:: We consider gradient-based optimisation of wide, shallow neural networks, where the output of each hidden node is scaled by a positive parameter. The scaling parameters are non-identical, differing from the classical Neural Tangent Kernel (NTK) parameterisation. We prove that for large such neural networks, with high probability, gradient flow and gradient descent converge to a global minimum and can learn features in some sense, unlike in the NTK parameterisation. We perform experiments illustrating our theoretical results and discuss the benefits of such scaling in terms of prunability and transfer learning.
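
The parameterisation described in the abstract can be illustrated with a minimal sketch. Only two points are taken from the abstract: each hidden node's output is multiplied by a positive scale, and the scales are not all equal. Everything else is an assumption made for illustration and does not come from this record: the ReLU activation, the fixed ±1 output weights, the square root applied to the scales (chosen so that equal scales of 1/m give the usual NTK factor 1/sqrt(m)), the power-law choice of unequal scales, and the hypothetical names `shallow_net`, `lam_ntk`, and `lam_asym`.

```python
import numpy as np

def shallow_net(x, W, a, lam):
    """Hypothetical shallow network: f(x) = sum_j sqrt(lam_j) * a_j * relu(w_j . x).

    lam holds one positive scaling parameter per hidden node (assumed form).
    """
    hidden = np.maximum(W @ x, 0.0)           # ReLU hidden activations (assumption)
    return float(np.sum(np.sqrt(lam) * a * hidden))

rng = np.random.default_rng(0)
m, d = 8, 3                                   # hidden width and input dimension
W = rng.standard_normal((m, d))               # hidden-layer weights
a = rng.choice([-1.0, 1.0], size=m)           # fixed output signs (assumption)
x = rng.standard_normal(d)                    # a single input vector

# Identical scaling: every node gets 1/m, i.e. an overall factor 1/sqrt(m).
lam_ntk = np.full(m, 1.0 / m)

# Non-identical scaling: an illustrative power-law decay, normalised to sum to 1.
raw = 1.0 / np.arange(1, m + 1) ** 2
lam_asym = raw / raw.sum()

out_ntk = shallow_net(x, W, a, lam_ntk)
out_asym = shallow_net(x, W, a, lam_asym)
```

Under the unequal scaling in this sketch, a few nodes carry most of the output, which loosely matches the abstract's remark about prunability. The sketch makes no claim about the convergence results themselves.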

Publication status:: Published

Peer review status:: Peer reviewed

Cite this record

APA Style

Caron, F., Ayed, F., Jung, P., Lee, H., Lee, J., & Yang, H. (2025). Over-parameterised shallow neural networks with asymmetrical node scaling: global convergence guarantees and feature learning. Transactions on Machine Learning Research, 2025(2).

MLA Style

Caron, F, et al. “Over-Parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning.” Transactions on Machine Learning Research, vol. 2025, no. 2, 2025.

Chicago Style

Caron, F, F Ayed, P Jung, H Lee, J Lee, and H Yang. 2025. “Over-Parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning.” Transactions on Machine Learning Research 2025 (2).

Access Document

Files:: Caron_et_al_2025_Over-parameterised_shallow_neural.pdf

(Preview, Version of record, pdf, 6.9MB, Terms of use)

Publication website:: https://openreview.net/forum?id=Sx1khIIi95

Authors

+ Caron, F

Institution:: University of Oxford
Division:: MPLS
Department:: Statistics
Oxford college:: Keble College
Role:: Author
ORCID:: 0000-0002-3952-224X

+ Ayed, F

Role:: Author

+ Jung, P

Role:: Author

+ Lee, H

Role:: Author

+ Lee, J

Role:: Author

More authors...

Publisher:: Journal of Machine Learning Research
Journal:: Transactions on Machine Learning Research
Volume:: 2025
Issue:: 2
Publication date:: 2025-02-18
Acceptance date:: 2025-02-10
EISSN:: 2835-8856

Language:: English
Pubs id:: 2085618
Local pid:: pubs:2085618
Deposit date:: 2025-02-13
ARK identifier:: ark:/29072/ora_ae84902a75e9414a8c5588d95a3eea1b

Terms of use

Copyright holder:: Caron et al
Rights statement:: ©2025 The Authors. This paper is an open access article distributed under the terms of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/)

Licence:: CC Attribution (CC BY)
