Thesis

Understanding convolutional neural networks

Abstract:

In the past decade, deep learning has fueled a number of exciting developments in artificial intelligence (AI). However, as deep learning is increasingly applied in high-impact domains such as medical diagnosis and autonomous driving, the cost of its failures also grows. Because of their high complexity (i.e., they are typically composed of millions of parameters), deep learning models are difficult to interpret. Thus, there is a great need for tools that help us understand how such models make their decisions. In this thesis, we introduce several methods for understanding convolutional neural networks (CNNs), the class of deep learning models typically applied to visual data — i.e., images and videos. Our techniques span three approaches to understanding a model: (1) describing the relationship between its inputs and outputs; (2) characterizing correlations between a model’s inputs and its internal representation; and (3) using visualization tools to easily and efficiently explore aspects of a model.

First, we tackle the attribution problem of identifying the parts of a model’s input (i.e., image regions) that are most responsible for its output decision. We present two techniques, meaningful perturbations and extremal perturbations, which work by perturbing the input image and learning the regions that, when edited out, most affect the model’s prediction. Second, we seek to understand how semantic concepts, from textures to various kinds of objects, are recognized by network parameters (a.k.a. neurons). We introduce Net2Vec, a novel paradigm that reveals how combinations of internal neurons encode specific concepts. Lastly, just as a stethoscope is used to explore the internal behavior of different parts of the body, we introduce a novel visualization technique — interactive similarity overlays — that allows an AI researcher or developer to quickly and easily explore the internal representation of a model. Together, these methods enable us to scientifically understand the external behavior of CNNs as well as their inner workings.
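To make the perturbation-based attribution idea concrete, below is a minimal PyTorch sketch in the spirit of the methods described above, not the thesis's exact formulation: the pretrained resnet18, the random stand-in image, the target_class index, the blurred baseline, the 28x28 mask resolution, and the single deletion penalty (weight 0.05) are all illustrative assumptions.

    import torch
    import torch.nn.functional as F
    import torchvision

    # Pretrained classifier to explain; any differentiable CNN would do.
    # (weights="IMAGENET1K_V1" assumes torchvision >= 0.13.)
    model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
    for p in model.parameters():
        p.requires_grad_(False)

    image = torch.rand(1, 3, 224, 224)   # stand-in for a preprocessed image
    target_class = 123                   # stand-in class index

    # A blurred copy of the image serves as the "deleted" appearance.
    baseline = F.avg_pool2d(image, kernel_size=11, stride=1, padding=5)

    # Coarse mask (1 = keep pixel, 0 = replace with blur), optimized via logits.
    mask_logits = torch.zeros(1, 1, 28, 28, requires_grad=True)
    optimizer = torch.optim.Adam([mask_logits], lr=0.1)

    for _ in range(100):
        mask = torch.sigmoid(mask_logits)
        mask_full = F.interpolate(mask, size=image.shape[-2:],
                                  mode="bilinear", align_corners=False)
        perturbed = mask_full * image + (1 - mask_full) * baseline
        score = model(perturbed).softmax(dim=1)[0, target_class]
        # Drop the class score while deleting as little of the image as possible.
        loss = score + 0.05 * (1 - mask).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Regions the optimizer chose to delete (mask near 0) form the attribution map.
    saliency = 1 - torch.sigmoid(mask_logits).detach()

The extremal perturbations variant mentioned above replaces this kind of soft area penalty with a hard constraint, searching for the mask of a fixed area whose removal (or preservation) most affects the prediction.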


Authors


Fong, R
Institution: University of Oxford
Division: MPLS
Department: Engineering Science
Sub department: Engineering Science
Research group: Visual Geometry Group
Oxford college: St John's College
Role: Author
ORCID: 0000-0001-8831-6402

Contributors

Institution: University of Oxford
Division: MPLS
Department: Engineering Science
Sub department: Engineering Science
Research group: Visual Geometry Group
Oxford college: New College
Role: Supervisor
ORCID: 0000-0003-1374-2858

Institution: University of Oxford
Division: MPLS
Department: Engineering Science
Sub department: Engineering Science
Research group: Visual Geometry Group
Role: Examiner
ORCID: 0000-0002-8945-8573

Institution: Massachusetts Institute of Technology
Role: Examiner
ORCID: 0000-0003-4915-0256


Funding

Funder identifier: http://dx.doi.org/10.13039/100014895
Funding agency for: Fong, R
Programme: Rhodes Scholarship

Funding agency for: Fong, R
Programme: Open Phil AI Fellowship


Type of award: DPhil
Level of award: Doctoral
Awarding institution: University of Oxford
