Thesis

Deep learning approaches for pre-clinical drug discovery

Abstract:

Deep learning methods have experienced a revolution, driven by their successful application in fields such as computer vision and natural language processing. In this thesis, we describe several novel methodologies leveraging deep learning for applications to pre-clinical drug discovery.

First, we propose a generative approach to the design of molecular linkers which incorporates basic 3D information. In large-scale tests, we find that our method substantially outperforms a database-based approach, the previous de facto approach for this problem. Through a series of case studies, we demonstrate the application of our approach to scaffold hopping, fragment linking and PROTAC design. We then extend this framework to incorporate physically meaningful 3D structural information, providing a richer prior for the generative process, and also apply our method to molecular elaboration tasks, such as R-group design.

We then turn our attention to predictive modelling, in particular structure-based virtual screening. We find that the advances in convolutional neural networks (CNNs) for general computer vision tasks are applicable to structure-based virtual screening. In addition, we propose two techniques to incorporate domain-specific knowledge into this framework. First, we show that limitations in docking necessitate the use of multi-pose scoring and demonstrate the benefits of an average scoring policy. Second, we propose a transfer learning approach to construct protein family-specific models, utilising knowledge of the differences between protein families.

Finally, we investigate how a generative approach can be used to improve the training and benchmark sets employed in structure-based virtual screening. We propose a deep learning method that generates decoys to a user’s preferred specification in order to control decoy bias or construct sets with a defined bias. We show that our approach significantly reduces the bias contained in such sets. We validate that our generated molecules are more challenging for docking-based approaches to separate from bioactive compounds than previous decoys. In addition, we show that CNN-based structure-based virtual screening methods can be trained on such compounds.

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Research group:
Oxford Protein Informatics Group
Oxford college:
Keble College
Role:
Author
ORCID:
0000-0002-6241-0123

Contributors

Institution:
University of Cambridge
Role:
Supervisor
Institution:
Exscientia Ltd
Role:
Supervisor
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Research group:
Oxford Protein Informatics Group
Role:
Supervisor
ORCID:
0000-0003-1388-2252


Funding agency for:
Imrie, FM
Grant:
EP/N509711/1
Funding agency for:
Imrie, FM


DOI:
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford


Language:
English
Keywords:
Subjects:
Pubs id:
2044948
Local pid:
pubs:2044948
Deposit date:
2021-04-01
