Thesis

Exploring protein folding mechanisms to enable the protein structure prediction of previously intractable targets

Abstract:

The prediction of a protein structure from its amino acid sequence is one of the grand challenges of computational biology. In this thesis, we describe extensions to our biologically inspired, fragment-based protein structure prediction pipeline, SAINT2, that improve the prediction of previously intractable targets.

While template-free protein structure prediction protocols can now produce good-quality models for many targets, modelling failure remains common; users need to be able to identify whether a good model exists among the many models produced for a given target. We describe Random Forest Quality Assessment (RFQAmodel), which assesses whether models produced by a protein structure prediction pipeline have the correct fold. By iteratively generating models and running RFQAmodel until a model is produced that is predicted to be correct with high confidence, we demonstrate how such a protocol can be used to focus computational efforts on difficult modelling targets.
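
The iterative protocol described above can be illustrated with a minimal sketch. It is not the thesis implementation: `generate_batch` and `predict_correct` are hypothetical stand-ins for a model generator (such as SAINT2) and a quality assessor (such as RFQAmodel), and the cutoff, batch size and model budget are illustrative assumptions only.

```python
from typing import Callable, List, Optional, Tuple


def iterative_modelling(
    generate_batch: Callable[[int], List[str]],   # hypothetical: returns n new model identifiers
    predict_correct: Callable[[str], float],      # hypothetical: predicted probability of a correct fold
    confidence_cutoff: float = 0.5,               # assumed value, not taken from the thesis
    batch_size: int = 100,                        # assumed value
    max_models: int = 1000,                       # assumed overall budget
) -> Tuple[Optional[str], float, int]:
    """Generate models in batches until one is predicted correct with high confidence.

    Returns the best model seen, its score, and the number of models generated.
    """
    best_model: Optional[str] = None
    best_score = float("-inf")
    n_generated = 0

    while n_generated < max_models:
        # Never request more models than the remaining budget allows.
        n_request = min(batch_size, max_models - n_generated)
        batch = generate_batch(n_request)
        if not batch:
            break  # the generator produced nothing, so stop early
        n_generated += len(batch)

        # Score every new model and keep track of the best one so far.
        for model in batch:
            score = predict_correct(model)
            if score > best_score:
                best_model, best_score = model, score

        # Stop as soon as a model clears the confidence cutoff.
        if best_score >= confidence_cutoff:
            break

    return best_model, best_score, n_generated
```

Under these assumptions, targets for which a confident model appears early stop after a single batch, so the remaining budget is spent only on targets that have not yet produced one.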

It is common for a target of interest to have an incomplete structure or a partial homologous template. We describe Flib-Flex, a method for incorporating known structural information into the fragment library that is used during prediction, and its combination with SAINT2-ScafFold to complete partial protein structures. We demonstrate that the missing regions can be modelled accurately in the presence of the known structure, and correct models can be identified using a modified version of RFQAmodel.

Long proteins remain a challenge for template-free protein structure prediction. There is experimental evidence that proteins can adopt the native conformation via the sequential folding of small units known as foldons. Inspired by this folding pathway hypothesis, we divide long protein structures into smaller, semi-stable segments and predict these individually in succession. This protocol, ScafFoldOn, enables the prediction of previously intractable targets.

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Research group:
Oxford Protein Informatics Group
Oxford college:
Lincoln College
Role:
Author
ORCID:
0000-0001-5677-8215

Contributors

Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Research group:
Oxford Protein Informatics Group
Role:
Supervisor
ORCID:
0000-0003-1388-2252
Institution:
UCB Pharma
Role:
Supervisor
Institution:
UCB Pharma
Role:
Supervisor
Role:
Supervisor


Funding agency for:
West, C
Grant:
EP/G037280/1
Programme:
EPSRC and MRC Systems Approaches to Biomedical Sciences


Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford

