Preprint

Completeness of reporting of clinical prediction models developed using supervised machine learning: a systematic review

Abstract:

Objective: While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. We aimed to systematically review the adherence of machine learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement.

Study design and setting: We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields (PROSPERO, CRD42019161764). We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item.

Results: Our search identified 24 814 articles, of which 152 were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0-46.4) of TRIPOD items. No article fully adhered to complete reporting of the abstract, and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), an appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and the model's predictive performance (5.9%, 95% CI 3.1 to 10.9). Reporting was often complete for source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3).

Conclusion: As with prediction model studies developed using conventional regression-based techniques, the completeness of reporting in ML-based prediction model studies is poor. Essential information needed to decide whether to use a model (i.e. model specification and its performance) is rarely reported. However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies, and thus TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste.
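The per-item adherence figures in the Results are proportions of articles with 95% confidence intervals. The abstract states neither the underlying counts nor the interval method, so the following is only a minimal sketch under two assumptions: that the flow-of-participants figure corresponds to 6 of the 152 included articles, and that a Wilson score interval was used. The function names `wilson_interval` and `as_percent` are hypothetical helpers introduced here for illustration.

```python
import math


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its CI method; Wilson is used
    here only because it is a common choice for small proportions.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def as_percent(proportion):
    """Express a proportion as a percentage rounded to one decimal place."""
    return round(100 * proportion, 1)


# Assumed count: 6 of 152 articles reporting the flow of participants.
adherent, included = 6, 152
low, high = wilson_interval(adherent, included)
print(as_percent(adherent / included), as_percent(low), as_percent(high))
# prints: 3.9 1.8 8.3
```

Under these assumptions the output agrees with the reported 3.9% (95% CI 1.8 to 8.3). Items that do not apply to every article (for example, those relevant only to model development) would have a smaller denominator than 152, so the same call would need the item-specific count of applicable articles.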

Publication status: Published
Peer review status: Not peer reviewed

Preprint server copy: 10.1101/2021.06.28.21259089

Authors


Role: Author
ORCID: 0000-0001-7401-4593

Role: Author
ORCID: 0000-0002-8032-6224

Role: Author
ORCID: 0000-0001-6798-2078

Institution: University of Oxford
Division: MSD
Department: NDORMS
Sub department: Botnar Institute for Musculoskeletal Sciences
Role: Author
ORCID: 0000-0002-0989-0623


Funder identifier: https://ror.org/054225q67
Funding agency for: Collins, GS
Grant: C49297/A27294

Funder identifier: https://ror.org/00aps1a34
Funding agency for: Dhiman, P; Collins, GS


Preprint server: medRxiv
Publication date: 2021-07-01
Language: English
Pubs id: 1185445
Local pid: pubs:1185445
Deposit date: 2025-03-17
