Thesis
Unsupervised learning and continual learning in neural networks
- Abstract:
- For decades, research has pursued the ambitious goal of designing computer models that learn to solve problems as effectively as humans do. Artificial neural networks, generic optimizable models originally inspired by biological neurons in the brain, appear to provide a promising answer. However, a significant limitation of current models is that they tend to be reliably proficient only on the tasks and datasets they were explicitly trained on. When more than one task or dataset is trained on, samples must be appropriately mixed and balanced so that training on successive batches does not induce forgetting of knowledge learned in previous batches, which is an impediment to continual learning. Furthermore, associations must be made explicit via paired input-target samples for the trained network to achieve its best performance on desired tasks; when the network is instead trained in an unsupervised manner without explicit targets, in an effort to reduce the cost of data collection, the knowledge it learns transfers significantly worse to desired tasks than under supervised training with explicit associations.
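As a concrete illustration of the mixing-and-balancing requirement, the following is a minimal sketch of experience replay, one standard way of interleaving old and new samples during training. It assumes a PyTorch-style classifier and training loop; the names (train_interleaved, replay_buffer, replay_k) are hypothetical rather than taken from the thesis.

```python
import random

import torch
import torch.nn.functional as F


def train_interleaved(model, optimizer, new_batches, replay_buffer, replay_k=16):
    """Interleave replayed old-task samples into each new-task batch.

    `replay_buffer` is a list of stored (input, target) pairs from earlier
    tasks. All names here are illustrative, not taken from the thesis.
    """
    model.train()
    for inputs, targets in new_batches:
        if replay_buffer:
            # Mix a handful of old samples into the current batch so that
            # gradient updates are balanced between old and new knowledge.
            old = random.sample(replay_buffer, min(replay_k, len(replay_buffer)))
            inputs = torch.cat([inputs, torch.stack([x for x, _ in old])])
            targets = torch.cat([targets, torch.stack([y for _, y in old])])
        optimizer.zero_grad()
        loss = F.cross_entropy(model(inputs), targets)
        loss.backward()
        optimizer.step()
        # Retain a few of the new samples for replay in future tasks.
        replay_buffer.extend(zip(inputs[:4].detach(), targets[:4]))
```

Running the same loop with replay_k=0 recovers pure sequential training, the regime in which forgetting of earlier batches is typically observed.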
Each of these problems relates to the fundamental issue of generalization: the ability to perform well despite novelty. In Chapter 2, we discuss conditions under which good generalization can be expected to arise, including small model size and similarity between training and test data, in supervised, unsupervised and continual learning contexts. Chapter 3 proposes a method for predicting when a model does not generalize to a test sample, deriving generalization bounds that quantify predictive reliability using both model size and similarity with the training data. Chapter 4 presents a clustering method that learns to approximately separate data into semantic concepts using an unsupervised objective that requires no manual labels. Chapter 5 presents a method for performing object localization without specialized training data, by repurposing saliency maps. Chapter 6 presents a continual learning method in which the model is forced to reconsider previously held knowledge concurrently with new knowledge, and Chapter 7 uses a dynamic architecture to suppress interference from new learning episodes on old knowledge.
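To illustrate the kind of unsupervised clustering objective referred to for Chapter 4, here is a minimal sketch of a mutual-information loss over paired cluster assignments of two views of the same data, written in PyTorch. It is a generic sketch under assumed conventions, not necessarily the exact formulation used in the thesis.

```python
import torch


def mutual_info_clustering_loss(p1, p2, eps=1e-8):
    """Negative mutual information between paired cluster assignments.

    p1, p2: (n, C) softmax outputs of a clustering head for two views of
    the same n inputs, e.g. images and their augmented copies. This is a
    generic sketch, not necessarily the thesis's exact formulation.
    """
    joint = p1.t() @ p2 / p1.shape[0]    # (C, C) joint assignment distribution
    joint = (joint + joint.t()) / 2.0    # symmetrize: views are exchangeable
    joint = joint.clamp(min=eps)         # avoid log(0)
    m1 = joint.sum(dim=1, keepdim=True)  # marginal of the first view
    m2 = joint.sum(dim=0, keepdim=True)  # marginal of the second view
    mi = (joint * (joint.log() - m1.log() - m2.log())).sum()
    return -mi                           # minimize negative MI to maximize MI
```

Maximizing this quantity tends to reward assignments that are both confident and evenly spread over clusters, which is what encourages an approximate separation of the data into semantic concepts without any manual labels.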
Without solutions to these generalization problems, neural networks cannot learn effectively in real time from naturally sequential and un-annotated real-world data, which limits their deployment options. Generalization is therefore a problem with immense practical implications, as well as being interesting theoretically and from the perspective of biologically inspired learning.
- Funding agency for:
- Ji, X
- Grant:
- 1753489
- Programme:
- EPSRC CDT in Autonomous Intelligent Machines and Systems
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Deposit date:
- 2021-09-02
Terms of use
- Copyright holder:
- Ji, X
- Copyright date:
- 2021