Journal article

ArcTEX—a novel clinical data enrichment pipeline to support real-world evidence oncology studies

Abstract:
Data stored within electronic health records (EHRs) offer a valuable source of information for real-world evidence (RWE) studies in oncology. However, many key clinical features are only available within unstructured notes. We present ArcTEX, a novel data enrichment pipeline developed to extract oncological features from NHS unstructured clinical notes with high accuracy, even in resource-constrained environments where availability of GPUs might be limited. By design, the predicted outcomes of ArcTEX are free of patient-identifiable information, making this pipeline ideally suited for use in Trust environments. We compare our pipeline to existing discriminative and generative models, demonstrating its superiority over approaches such as Llama3/3.1/3.2 and BERT-based models, with a mean accuracy of 98.67% for several essential clinical features in endometrial and breast cancer. Additionally, we show that as few as 50 annotated training examples are needed to adapt the model to a different oncology area, such as lung cancer, with a different set of priority clinical features, achieving a comparable mean accuracy of 95%.
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.3389/fdgth.2025.1561358

Authors


Institution:
University of Oxford
Division:
MPLS
Department:
Computer Science
Sub department:
Computer Science
Role:
Author


Publisher:
Frontiers Media
Journal:
Frontiers in Digital Health
Volume:
7
Article number:
1561358
Publication date:
2025-05-09
Acceptance date:
2025-04-23
DOI:
10.3389/fdgth.2025.1561358
EISSN:
2673-253X


Language:
English
Keywords:
Source identifiers:
2950680
Deposit date:
2025-05-23
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.