Thesis
Toward efficient deep learning with sparse neural networks
- Abstract:
Despite the tremendous success that deep learning has achieved in recent years, it remains challenging to deal with the excessive computational and memory costs involved in executing deep learning-based applications. To address this challenge, this thesis studies sparse neural networks, particularly their construction, initialization, and large-scale training, as a step toward efficient deep learning.
Firstly, this thesis addresses the problem of finding sparse neural networks by pruning. Network pruning is an effective methodology to sparsify neural networks, and yet existing approaches often introduce hyperparameters that either need to be tuned with expert knowledge or are based on ad-hoc intuitions, and they typically entail iterative training steps. As an alternative, this thesis begins by proposing an efficient pruning method that is applied to a neural network prior to training, in a single shot. The sparse neural networks obtained with this method, once trained, exhibit state-of-the-art performance on various image classification tasks.
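One way to read the "single shot" idea concretely is as a saliency-based criterion evaluated on one mini-batch before any training. The following is a minimal PyTorch-style sketch under that assumption; the function name, the |gradient × weight| score, and the global threshold are illustrative choices, not necessarily the thesis's exact procedure.

```python
import torch

def single_shot_prune(model, loss_fn, batch, sparsity=0.9):
    """Score every weight on a single mini-batch and keep only the
    top (1 - sparsity) fraction, before any training takes place."""
    inputs, targets = batch
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    # Saliency of each connection: |gradient * weight|, computed once.
    scores = {name: (p.grad * p.detach()).abs()
              for name, p in model.named_parameters()
              if p.grad is not None and p.dim() > 1}

    # Global threshold so that only the k highest-scoring weights survive.
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int((1.0 - sparsity) * flat.numel()))
    threshold = torch.topk(flat, k).values[-1]

    # Binary masks to be applied to the weights before (and during) training.
    return {name: (s >= threshold).float() for name, s in scores.items()}
```

In practice the returned masks would be multiplied into the corresponding weight tensors before training and re-applied after each optimizer step so that pruned connections remain zero.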
Albeit efficient, this approach of pruning at initialization is not well understood; it remains unclear exactly why it can be effective. This thesis then extends the method by developing a new perspective from which the problem of finding trainable sparse neural networks is approached based on network initialization. Since initialization is a key to the success of finding and training sparse neural networks, this thesis proposes a sufficient initialization condition that can be satisfied easily with a simple optimization step and that, once achieved, accelerates the training of sparse neural networks quite significantly.
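As an illustration of how an initialization condition might be satisfied "with a simple optimization step", the sketch below nudges each weight matrix toward approximate orthogonality by minimizing a Frobenius-norm penalty with a few gradient steps. The specific condition, the function name, and the hyperparameters are assumptions for exposition, not the condition proposed in the thesis.

```python
import torch

def enforce_orthogonality(weights, num_steps=200, lr=0.1):
    """Minimize ||W^T W - I||_F^2 for each 2-D weight tensor with a few
    steps of gradient descent, as a stand-in 'simple optimization step'
    applied at initialization. The caller re-applies any sparsity masks
    after each step so that pruned connections stay at zero."""
    params = [W.clone().requires_grad_(True) for W in weights]
    opt = torch.optim.SGD(params, lr=lr)
    for _ in range(num_steps):
        opt.zero_grad()
        loss = sum(((W.t() @ W
                     - torch.eye(W.shape[1], device=W.device)) ** 2).sum()
                   for W in params)
        loss.backward()
        opt.step()
    return [W.detach() for W in params]
```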
While sparse neural networks can be obtained by pruning at initialization, there has been little study of the subsequent training of these sparse networks. This thesis lastly concentrates on studying data parallelism -- a straightforward approach to speeding up neural network training by parallelizing it on a distributed computing system -- under the influence of sparsity. To this end, the effects of data parallelism and sparsity are first measured accurately through extensive experiments accompanied by metaparameter search. This thesis then establishes theoretical results that precisely account for these effects, which have previously been addressed only partially and empirically and thus remained debatable.
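The measurement protocol described above can be pictured as follows: for each batch size (the degree of data parallelism) and sparsity level, search over training metaparameters such as the learning rate, and record the fewest steps needed to reach a fixed target metric. The sketch below assumes a user-supplied train_fn and illustrates only the bookkeeping; it is not the thesis's experimental code.

```python
def steps_to_target(train_fn, batch_sizes, learning_rates,
                    sparsity=0.9, target_error=0.1):
    """For each batch size, search over learning rates and keep the
    smallest number of steps at which `train_fn` reaches the target.
    `train_fn(batch_size, lr, sparsity, target_error)` is assumed to
    return a step count, or None if the target is never reached."""
    best = {}
    for bs in batch_sizes:
        counts = [train_fn(bs, lr, sparsity, target_error)
                  for lr in learning_rates]
        counts = [c for c in counts if c is not None]
        best[bs] = min(counts) if counts else None
    return best
```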
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Pubs id:
- 2044869
- Local pid:
- pubs:2044869
- Deposit date:
- 2020-11-21
- Title:
- Toward efficient deep learning with sparse neural networks
- DOI:
- 10.5287/ora-kq1p0n8vd-2
- Created date:
- 2024-12-02
- Title:
- Toward efficient deep learning with sparse neural networks
- DOI:
- 10.5287/ora-kq1p0n8vd-1
- Created date:
- 2024-12-02
Terms of use
- Copyright holder:
- Lee, N
- Copyright date:
- 2020