Journal article icon

Journal article

Generating synthetic mixed-type longitudinal electronic health records for artificial intelligent applications

Abstract:

The recent availability of electronic health records (EHRs) have provided enormous opportunities to develop artificial intelligence (AI) algorithms. However, patient privacy has become a major concern that limits data sharing across hospital settings and subsequently hinders the advances in AI. Synthetic data, which benefits from the development and proliferation of generative models, has served as a promising substitute for real patient EHR data. However, the current generative models are limited as they only generate single type of clinical data for a synthetic patient, i.e., either continuous-valued or discrete-valued. To mimic the nature of clinical decision-making which encompasses various data types/sources, in this study, we propose a generative adversarial network (GAN) entitled EHR-M-GAN that simultaneously synthesizes mixed-type timeseries EHR data. EHR-M-GAN is capable of capturing the multidimensional, heterogeneous, and correlated temporal dynamics in patient trajectories. We have validated EHR-M-GAN on three publicly-available intensive care unit databases with records from a total of 141,488 unique patients, and performed privacy risk evaluation of the proposed model. EHR-M-GAN has demonstrated its superiority over state-of-the-art benchmarks for synthesizing clinical timeseries with high fidelity, while addressing the limitations regarding data types and dimensionality in the current generative models. Notably, prediction models for outcomes of intensive care performed significantly better when training data was augmented with the addition of EHR-M-GAN-generated timeseries. EHR-M-GAN may have use in developing AI algorithms in resource-limited settings, lowering the barrier for data acquisition while preserving patient privacy.

Publication status:
Published
Peer review status:
Peer reviewed

Actions


Access Document


Publisher copy:
10.1038/s41746-023-00834-7

Authors


More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MSD
Department:
Nuffield Department of Population Health
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Oxford college:
Kellogg College
Role:
Author
ORCID:
0000-0002-1552-5630


Publisher:
Springer Nature
Journal:
NPJ Digital Medicine More from this journal
Volume:
6
Issue:
1
Article number:
98
Publication date:
2023-05-27
Acceptance date:
2023-05-05
DOI:
EISSN:
2398-6352
Pmid:
37244963


Language:
English
Keywords:
Pubs id:
1346934
Local pid:
pubs:1346934
Deposit date:
2023-06-06

Terms of use



Views and Downloads






If you are the owner of this record, you can report an update to it here: Report update to this record

TO TOP