Journal article

Developing automated methods for disease subtyping in UK Biobank: an exemplar study on stroke

Abstract:

Background: Better phenotyping of routinely collected coded data would be useful for research and health improvement. For example, the precision of coded data for hemorrhagic stroke (intracerebral hemorrhage [ICH] and subarachnoid hemorrhage [SAH]) may be below 50%. This work aimed to investigate the feasibility and added value of automated methods applied to clinical radiology reports to improve stroke subtyping.

Methods: From a sub-population of 17,249 Scottish UK Biobank participants, we ascertained those with an incident stroke code in hospital, death record or primary care administrative data by September 2015, and ≥ 1 clinical brain scan report. We used a combination of natural language processing and clinical knowledge inference on brain scan reports to assign a stroke subtype (ischemic vs ICH vs SAH) for each participant and assessed performance by precision and recall at entity and patient levels.
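As an illustration of the patient-level evaluation described in the Methods, the following is a minimal sketch of per-subtype precision and recall, assuming each participant has exactly one reference subtype and one automatically assigned subtype. The function name `precision_recall` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter

# Subtype labels used in the abstract; the data below are illustrative only.
SUBTYPES = ("ischemic", "ICH", "SAH")


def precision_recall(reference, predicted, subtypes=SUBTYPES):
    """Return {subtype: (precision, recall)} for paired label lists.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    A value is None when its denominator is zero.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for ref, pred in zip(reference, predicted):
        if ref == pred:
            tp[ref] += 1
        else:
            fp[pred] += 1  # predicted subtype gains a false positive
            fn[ref] += 1   # reference subtype gains a false negative

    scores = {}
    for subtype in subtypes:
        p_den = tp[subtype] + fp[subtype]
        r_den = tp[subtype] + fn[subtype]
        precision = tp[subtype] / p_den if p_den else None
        recall = tp[subtype] / r_den if r_den else None
        scores[subtype] = (precision, recall)
    return scores


# Toy example (hypothetical labels, not study data).
reference = ["ICH", "ICH", "SAH", "ischemic", "ischemic", "ischemic"]
predicted = ["ICH", "SAH", "SAH", "ischemic", "ICH", "ischemic"]
scores = precision_recall(reference, predicted)
for subtype, (precision, recall) in scores.items():
    print(subtype, precision, recall)
```

The same calculation would apply at entity level by pairing each annotated entity with its extracted counterpart instead of pairing participants.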

Results: Of 225 participants with an incident stroke code, 207 had a relevant brain scan report and were included in this study. Entity-level precision and recall ranged from 78 to 100%. At patient level, the automated methods showed precision and recall that were very good for ICH (both 89%) and good for SAH (both 82%) but, as expected, lower for ischemic stroke (73% and 64%, respectively), suggesting that coded data remain the preferred method for identifying the latter stroke subtype.

Conclusions: Our automated method applied to radiology reports provides a feasible, scalable and accurate solution to improve disease subtyping when used in conjunction with administrative coded health data. Future research should validate these findings in a different population setting.

Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1186/s12911-021-01556-0

Authors

Role:
Author
ORCID:
0000-0002-0213-5668

Institution:
University of Oxford
Division:
MSD
Department:
Nuffield Department of Population Health
Role:
Author
ORCID:
0000-0002-4816-8991

Institution:
University of Oxford
Division:
MSD
Department:
Nuffield Department of Population Health
Sub department:
Clinical Trial Service Unit
Role:
Author
ORCID:
0000-0003-1938-5038

Contributors

Role:
Contributor


Funder identifier:
https://ror.org/03x94j517
Grant:
MC_PC_18029


Publisher:
BioMed Central
Journal:
BMC Medical Informatics and Decision Making
Volume:
21
Issue:
1
Article number:
191
Publication date:
2021-06-15
Acceptance date:
2021-06-08
DOI:
10.1186/s12911-021-01556-0
EISSN:
1472-6947
Pmid:
34130677


Language:
English
Pubs id:
1302407
Local pid:
pubs:1302407
Deposit date:
2025-02-03