Conference item

Drug discovery under covariate shift with domain-informed prior distributions over functions

Abstract:
Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift—a setting that poses a challenge to standard deep learning methods. In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques.
Publication status:
Published
Peer review status:
Peer reviewed

Publication website:
https://proceedings.mlr.press/v202/klarner23a.html

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Oxford college:
Green Templeton College
Role:
Author
ORCID:
0000-0003-1731-8405

Contributors

Role:
Editor


Publisher:
Journal of Machine Learning Research
Volume:
202
Pages:
17176-17197
Series:
Proceedings of Machine Learning Research
Publication date:
2023-08-31
Acceptance date:
2023-06-14
Event title:
40th International Conference on Machine Learning (ICML 2023)
Event location:
Honolulu, Hawaii, USA
Event website:
https://icml.cc/Conferences/2023/Dates
Event start date:
2023-07-23
Event end date:
2023-07-29
ISSN:
2640-3498


Language:
English
Pubs id:
1493157
Local pid:
pubs:1493157
Deposit date:
2023-07-18