Preprint

Democratising clinical AI through dataset condensation for classical clinical models

Abstract:
Dataset condensation (DC) learns a compact synthetic dataset that enables models to match the performance of full-data training, prioritising utility over distributional fidelity. While typically explored for computational efficiency, DC also holds promise for healthcare data democratisation, especially when paired with differential privacy, allowing synthetic data to serve as a safe alternative to real records. However, existing DC methods rely on differentiable neural networks, limiting their compatibility with widely used clinical models such as decision trees and Cox regression. We address this gap using a differentially private, zero-order optimisation framework that extends DC to non-differentiable models using only function evaluations. Empirical results across six datasets, including both classification and survival tasks, show that the proposed method produces condensed datasets that preserve model utility while providing effective differential privacy guarantees—enabling model-agnostic data sharing for clinical prediction tasks without exposing sensitive patient information.
Publication status:
Published
Peer review status:
Not peer reviewed

Access Document

Preprint server copy:
10.48550/arXiv.2603.09356

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
ORCID:
0000-0002-7006-1947
Institution:
University of Oxford
Division:
MSD
Department:
NDM
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Jesus College
Role:
Author
ORCID:
0000-0002-3116-218X
Institution:
University of Oxford
Division:
MSD
Department:
Oncology
Role:
Author
ORCID:
0000-0003-2391-5361


Funder identifier:
https://ror.org/029chgv08
Funder identifier:
https://ror.org/052gg0110
Funder identifier:
https://ror.org/001aqnf71
Funder identifier:
https://ror.org/0526snb40


Preprint server:
arXiv
Publication date:
2026-03-10
Language:
English
Pubs id:
2393327
Local pid:
pubs:2393327
Deposit date:
2026-05-13