
Thesis

Unsupervised learning and continual learning in neural networks

Abstract:

For decades, research has pursued the ambitious goal of designing computer models that learn to solve problems as effectively as humans can. Artificial neural networks -- generic, optimizable models originally inspired by biological neurons in the brain -- appear to provide a promising answer. However, a significant limitation of current models is that they tend to be reliably proficient only at the tasks and datasets on which they were explicitly trained. When more than one task or dataset is trained on, samples must be appropriately mixed and balanced so that training on successive batches does not induce forgetting of the knowledge learned in previous batches; this requirement is an impediment to continual learning. Furthermore, associations need to be made explicit via paired input-target samples for the trained network to achieve its best performance on desired tasks: when the network is instead trained in an unsupervised manner without explicit targets, in an effort to reduce the cost of data collection, the knowledge it learns transfers to desired tasks significantly worse than under supervised training with explicit associations.
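To make the forgetting problem concrete, the following minimal sketch (entirely illustrative: the synthetic tasks, network size and hyperparameters are assumptions, not taken from the thesis) trains a small network sequentially on two tasks without any mixing or replay, and shows accuracy on the first task collapsing:

```python
# Toy illustration of catastrophic forgetting: train on task A, then on
# task B with no mixing or replay, and watch task-A accuracy collapse.
# All data, sizes and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    # Binary classification of 2-D Gaussian points; "shift" moves both
    # the data cloud and the decision boundary (x0 + x1 > 2 * shift).
    x = torch.randn(512, 2) + shift
    y = (x.sum(dim=1) > 2 * shift).long()
    return x, y

task_a = make_task(0.0)   # boundary at x0 + x1 = 0
task_b = make_task(4.0)   # boundary at x0 + x1 = 8

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

def train(x, y, steps=200):
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

train(*task_a)
print("task A accuracy after training on A:", accuracy(*task_a))  # high
train(*task_b)  # sequential training on task B only
print("task A accuracy after training on B:", accuracy(*task_a))  # drops
```

Interleaving balanced batches from both tasks in the training loop avoids this drop, which is precisely the mixing requirement described above.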

Each of these problems relates to the fundamental issue of generalization: the ability to perform well despite novelty. In Chapter 2, we discuss conditions under which good generalization can be expected to arise, including small model size and similarity between training and test data, in supervised, unsupervised and continual learning contexts. Chapter 3 proposes a method for predicting when a model does not generalize to a test sample, deriving generalization bounds that quantify predictive reliability using both model size and similarity with training data. Chapter 4 presents a clustering method that learns to approximately separate data into semantic concepts using an unsupervised objective that requires no manual labels (one such objective is sketched below). Chapter 5 presents a method for object localization without specialized training data, achieved by repurposing saliency maps. Chapter 6 presents a continual learning method in which the model is forced to reconsider previously held knowledge concurrently with new knowledge, and Chapter 7 uses a dynamic architecture to suppress interference from new learning episodes on old knowledge.
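As a concrete illustration of the kind of label-free clustering objective Chapter 4 describes, the sketch below maximizes the mutual information between a network's soft cluster assignments for two augmented views of the same inputs; no manual labels appear in the loss. This is a generic instance of the idea under assumed names (`net`, `augment`), not necessarily the thesis's exact objective:

```python
# Sketch of a label-free clustering objective: maximize mutual information
# between cluster assignments of two augmented views of the same inputs.
# A generic illustration; not necessarily the thesis's exact objective.
import torch

def mutual_information_loss(p1, p2, eps=1e-8):
    # p1, p2: (batch, k) softmax cluster probabilities for paired views.
    joint = p1.t() @ p2 / p1.size(0)        # (k, k) empirical joint
    joint = (joint + joint.t()) / 2          # symmetrize: order-invariant
    joint = joint.clamp(min=eps)
    m1 = joint.sum(dim=1, keepdim=True)      # row marginal
    m2 = joint.sum(dim=0, keepdim=True)      # column marginal
    # I(Z1; Z2) = sum_ij P_ij * (log P_ij - log P_i - log P_j)
    mi = (joint * (joint.log() - m1.log() - m2.log())).sum()
    return -mi                               # minimize the negative MI

# Usage (hypothetical `net` and `augment`):
#   p1 = torch.softmax(net(x), dim=1)
#   p2 = torch.softmax(net(augment(x)), dim=1)
#   loss = mutual_information_loss(p1, p2)
```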

Without solutions to these generalization problems, neural networks cannot learn effectively in real time from naturally sequential, unannotated real-world data, which limits where they can be deployed. Generalization is therefore a problem with immense practical implications, as well as being interesting both theoretically and from the perspective of biologically inspired learning.


Authors

Ji, X
Division: MPLS
Department: Engineering Science
Role: Author

Contributors

Role: Supervisor
Role: Supervisor


Funding agency for: Ji, X
Grant: 1753489
Programme: EPSRC CDT in Autonomous Intelligent Machines and Systems


DOI:
Type of award: DPhil
Level of award: Doctoral
Awarding institution: University of Oxford
