Journal article
Learning from docked ligands: ligand-based features rescue structure-based scoring functions when trained on docked poses
- Abstract:
- Machine learning scoring functions for protein–ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein–ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes. We explore how the use of docked rather than crystallographic poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses. We also present a new, freely available validation set—the Updated DUD-E Diverse Subset—for binding affinity prediction using data from DUD-E and ChEMBL. Despite strong performance on docked poses of the PDBbind Core Sets, we find that our hybrid scoring function sometimes generalizes poorly to a protein target not represented in the training set, demonstrating the need for improved scoring functions and additional validation benchmarks.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 876.6KB)
- Publisher copy:
- 10.1021/acs.jcim.1c00096
Authors
- Publisher:
- American Chemical Society
- Journal:
- Journal of Chemical Information and Modeling
- Volume:
- 62
- Issue:
- 22
- Pages:
- 5329–5341
- Publication date:
- 2021-09-01
- Acceptance date:
- 2021-07-12
- DOI:
- 10.1021/acs.jcim.1c00096
- EISSN:
- 1549-960X
- ISSN:
- 1549-9596
- Language:
- English
- Pubs id:
- 1159210
- Local pid:
- pubs:1159210
- Deposit date:
- 2021-07-14
- ARK identifier:
Terms of use
- Copyright holder:
- Boyles et al.
- Copyright date:
- 2021
- Rights statement:
- © 2021 The Authors. Published by American Chemical Society.
- Notes:
- This is the accepted manuscript version of the article. The final version is available online from American Chemical Society at: https://doi.org/10.1021/acs.jcim.1c00096