Thesis
Exploring protein folding mechanisms to enable the protein structure prediction of previously intractable targets
- Abstract:
-
The prediction of a protein structure from its amino acid sequence is one of the grand challenges of computational biology. In this thesis, we describe extensions to our biologically inspired, fragment-based protein structure prediction pipeline, SAINT2, to improve the prediction of previously intractable targets.
While template-free protein structure prediction protocols can now produce good-quality models for many targets, modelling failure remains common; users need to be able to identify whether a good model exists among the many models produced for a given target. We describe Random Forest Quality Assessment (RFQAmodel), which assesses whether models produced by a protein structure prediction pipeline have the correct fold. By iteratively generating models and running RFQAmodel until a model is produced that is predicted to be correct with high confidence, we demonstrate how such a protocol can be used to focus computational efforts on difficult modelling targets.
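The iterative protocol described above can be illustrated with a minimal sketch. It is not the RFQAmodel or SAINT2 interface: `generate_batch`, `predict_correct`, the batch size, the batch budget, and the confidence threshold are all hypothetical stand-ins chosen for illustration, and the toy generator and scorer exist only to make the loop runnable.

```python
import itertools
from typing import Callable, List, Optional, Tuple


def iterate_until_confident(
    generate_batch: Callable[[int], List[str]],
    predict_correct: Callable[[str], float],
    threshold: float,
    batch_size: int = 100,
    max_batches: int = 10,
) -> Tuple[Optional[str], float, int]:
    """Generate models in batches until one is predicted correct with
    confidence >= threshold, or until the batch budget is spent.

    Returns (best_model, best_score, models_generated).
    All parameter names and defaults are illustrative assumptions.
    """
    best_model: Optional[str] = None
    best_score = float("-inf")
    generated = 0
    for _ in range(max_batches):
        for model in generate_batch(batch_size):
            generated += 1
            score = predict_correct(model)  # stand-in for a quality-assessment classifier
            if score > best_score:
                best_model, best_score = model, score
        # Stop spending compute once a confidently correct model exists.
        if best_score >= threshold:
            break
    return best_model, best_score, generated


def make_toy_generator() -> Callable[[int], List[str]]:
    """Hypothetical stand-in for a structure prediction pipeline."""
    counter = itertools.count()
    return lambda n: [f"model_{next(counter)}" for _ in range(n)]


def toy_predict(model: str) -> float:
    """Hypothetical scorer: later models score higher, capped at 1.0."""
    return min(1.0, int(model.split("_")[1]) / 250)


if __name__ == "__main__":
    print(iterate_until_confident(make_toy_generator(), toy_predict, threshold=0.9))
```

Under these assumptions, easy targets stop after the first batch, while difficult targets keep receiving batches until the budget runs out, which is the sense in which such a loop concentrates computation on hard cases.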
It is common for a target of interest to have an incomplete structure or a partial homologous template. We describe Flib-Flex, a method for incorporating known structural information into the fragment library that is used during prediction, and its combination with SAINT2-ScafFold to complete partial protein structures. We demonstrate that the missing regions can be modelled accurately in the presence of the known structure, and that correct models can be identified using a modified version of RFQAmodel.
Long proteins remain a challenge for template-free protein structure prediction. There is experimental evidence that proteins can adopt the native conformation via the sequential folding of small units known as foldons. Inspired by this folding pathway hypothesis, we divide long protein structures into smaller, semi-stable segments and predict these individually in succession. This protocol, ScafFoldOn, enables the prediction of previously intractable targets.
Authors
Contributors
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Statistics
- Research group:
- Oxford Protein Informatics Group
- Role:
- Supervisor
- ORCID:
- 0000-0003-1388-2252
- Institution:
- UCB Pharma
- Role:
- Supervisor
- Institution:
- UCB Pharma
- Role:
- Supervisor
- Role:
- Supervisor
- Funding agency for:
- West, C
- Grant:
- EP/G037280/1
- Programme:
- EPSRC and MRC Systems Approaches to Biomedical Sciences
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
-
English
- Keywords:
- Subjects:
- Deposit date:
-
2020-12-30
Terms of use
- Copyright holder:
- West, C
- Copyright date:
- 2019