Thesis
Explaining black box algorithms: epistemological challenges and machine learning solutions
- Abstract:
This dissertation seeks to clarify and resolve a number of fundamental issues surrounding algorithmic explainability. What constitutes a satisfactory explanation of a supervised learning model or prediction? What are the basic units of explanation and how do they vary across agents and contexts? Can reliable methods be designed to generate model-agnostic algorithmic explanations? I tackle these questions over the course of eight chapters, examining existing work in interpretable machine learning (iML), developing a novel theoretical framework for comparing and developing iML solutions, and ultimately implementing a number of new algorithms that deliver global and local explanations with statistical guarantees. At each turn, I emphasise three crucial desiderata: algorithmic explanations must be causal, pragmatic, and severely tested.
In Chapter 1, I introduce the topic through real-world examples that vividly demonstrate the ethical and epistemological imperative to better understand the behaviour of black box models. A literature review follows in Chapters 2 and 3, where I situate the project at the intersection of critical data studies, philosophy of information, and computational statistics. In Chapter 4, I examine conceptual challenges for iML that result in misleading, counterintuitive explanations. In Chapter 5, I propose a formal framework for iML – the explanation game – in which players collaborate to find the best solution(s) to explanatory questions through a gradual procedure of iterative refinements. In Chapter 6, I introduce a novel test of conditional independence that doubles as a flexible measure of global variable importance. In Chapter 7, I combine feature attributions and counterfactuals into a single method that retains and extends the axiomatic guarantees of Shapley values while rationalising results for agents with well-defined preferences and beliefs. I conclude in Chapter 8 with a review of my results and a discussion of their significance for data scientists, policymakers, and end users.
Authors
Contributors
- Institution:
- University of Oxford
- Division:
- Social Sciences Division (SSD)
- Department:
- Oxford Internet Institute
- Sub department:
- Oxford Internet Institute
- Role:
- Supervisor
- ORCID:
- 0000-0002-5444-2280
- Institution:
- University College London
- Role:
- Supervisor
- Institution:
- Caltech
- Role:
- Examiner
- Institution:
- University of Oxford
- Division:
- Social Sciences Division (SSD)
- Department:
- Oxford Internet Institute
- Sub department:
- Oxford Internet Institute
- Role:
- Examiner
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2021-04-23