Journal article

Narrowing the gap between machine learning scoring functions and free energy perturbation using augmented data

Abstract:
Machine learning offers great promise for fast and accurate binding affinity predictions. However, current models lack robust evaluation and fail on tasks encountered in (hit-to-)lead optimisation, such as ranking the binding affinities of a congeneric series of ligands, which limits their application in drug discovery. Here, we address these issues by first introducing a novel attention-based graph neural network model called AEV-PLIG (atomic environment vector–protein ligand interaction graph). Second, we introduce a new and more realistic out-of-distribution test set called the OOD Test. We benchmark our model on this set, on CASF-2016, and on a test set used for free energy perturbation (FEP) calculations; this not only highlights the competitive performance of AEV-PLIG but also provides a realistic assessment of machine learning models against rigorous physics-based approaches. Moreover, we demonstrate that leveraging augmented data (generated using template-based modelling or molecular docking) can significantly improve binding affinity prediction correlation and ranking on the FEP benchmark (weighted mean Pearson correlation coefficient (PCC) and Kendall’s τ increase from 0.41 and 0.26 to 0.59 and 0.42, respectively). Together, these strategies are closing the performance gap with FEP calculations (FEP+ achieves weighted mean PCC and Kendall’s τ of 0.68 and 0.49 on the FEP benchmark) while being ~400,000 times faster.
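The weighted mean PCC and Kendall’s τ quoted in the abstract are aggregated over several congeneric series. The sketch below illustrates one way such an aggregate could be computed. It is not taken from the article: weighting each series by its number of ligands is an assumption, the `series` data are made-up placeholder values, and the function names are hypothetical. Kendall’s τ is computed here in its simple tau-a form, which assumes no tied values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs (no tie correction)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Placeholder data (NOT from the article): experimental vs. predicted
# affinities for two hypothetical congeneric series.
series = {
    "target_a": ([-9.1, -8.4, -7.9, -7.2], [-8.8, -8.5, -7.1, -7.4]),
    "target_b": ([-10.2, -9.6, -9.0, -8.1, -7.5], [-9.9, -9.1, -9.3, -8.4, -7.0]),
}

# Assumption: each series is weighted by its number of ligands.
weights = [len(exp) for exp, _ in series.values()]
pccs = [pearson(exp, pred) for exp, pred in series.values()]
taus = [kendall_tau(exp, pred) for exp, pred in series.values()]

weighted_pcc = weighted_mean(pccs, weights)
weighted_tau = weighted_mean(taus, weights)
print(f"weighted mean PCC = {weighted_pcc:.2f}, weighted mean tau = {weighted_tau:.2f}")
```

Computing the correlation within each series before averaging reflects the lead-optimisation setting described above, where what matters is ranking ligands against the same target rather than across targets.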
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1038/s42004-025-01428-y

Authors


Institution:
University of Oxford
Role:
Author
ORCID:
0009-0004-1309-1076
Institution:
University of Oxford
Role:
Author
ORCID:
0000-0002-5860-7468
Institution:
University of Oxford
Role:
Author
ORCID:
0000-0003-1388-2252
Role:
Author
ORCID:
0000-0003-3385-964X
Institution:
University of Oxford
Role:
Author
ORCID:
0000-0003-1731-8405


Publisher:
Nature Research
Journal:
Communications Chemistry
Volume:
8
Issue:
1
Article number:
41
Publication date:
2025-02-08
Acceptance date:
2025-01-23
DOI:
10.1038/s42004-025-01428-y
EISSN:
2399-3669


Language:
English
Source identifiers:
2670517
Deposit date:
2025-02-09
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
