Journal article
The challenges of replication: A worked example of methods reproducibility using electronic health record data
- Abstract:
- Objective: The ability to reproduce the work of others is an essential part of the scientific disciplines. Replicating observational studies using electronic health record (EHR) data can be challenging due to complexities in data access, variations in EHR systems across institutions, and the potential for unaccounted confounding variables. Our aim is to identify the barriers to methods reproducibility for replication studies using EHR data. Methods: We replicated a study that examined the risk of hospitalisation following a positive COVID-19 test in individuals with diabetes. Using EHR data from NHS England’s Secure Data Environment (SDE) covering the whole of England, UK (population 57m), we sought to replicate findings from the original study, which used data from Greater Manchester (a large urban region in the UK, population 2.9m). Both analyses were conducted in Trusted Research Environments (TREs) or SDEs containing linked primary and secondary care data; however, methods reproducibility was not straightforward. Differences between the environments that contributed to the difficulties were documented, categorized into themes, and converted into a list of recommendations for TRE/SDEs. Results: Small differences between the environments and the data sources led to several challenges in methods reproducibility. Our recommendations for TRE/SDEs should facilitate future replication studies. The recommendations include: improved machine-readable metadata for EHR data; standardization of governance processes to facilitate federated analysis; mandatory code sharing; and a support structure for data engineers and analysts within each environment. We also propose a new theme for research, “data reproducibility”, defined as the ability to prepare, extract and clean data from a different database for a replication study. Conclusion: Even with perfect code sharing, data reproducibility remains a challenge. Our recommendations have the potential to reduce the barriers to replication studies and therefore enhance the potential of observational studies using EHR data.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Publisher copy:
- 10.1371/journal.pone.0326335
Authors
+ NIHR Cambridge Biomedical Research Centre
- Funder identifier:
- https://ror.org/05m8dr349
+ NIHR Manchester Biomedical Research Centre
- Funder identifier:
- https://ror.org/05njkjr15
- Publisher:
- Public Library of Science
- Journal:
- PLoS ONE
- Volume:
- 20
- Issue:
- 7
- Article number:
- e0326335
- Publication date:
- 2025-07-18
- Acceptance date:
- 2025-05-28
- DOI:
- EISSN:
- 1932-6203
- Language:
- English
- Source identifiers:
- 3128360
- Deposit date:
- 2025-07-18
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.