Journal article

Tumour purity assessment with deep learning in colorectal cancer and impact on molecular analysis

Abstract:
Tumour content plays a pivotal role in directing the bioinformatic analysis of molecular profiles such as copy number variation (CNV). In clinical application, tumour purity estimation (TPE) is achieved either through visual pathological review [conventional pathology (CP)] or the deconvolution of molecular data. While CP provides a direct measurement, it demonstrates modest reproducibility and lacks standardisation. Conversely, deconvolution methods offer an indirect assessment with uncertain accuracy, underscoring the necessity for innovative approaches. SoftCTM is an open-source, multiorgan deep-learning (DL) model for the detection of tumour and non-tumour cells in H&E-stained slides, developed within the Overlapped Cell on Tissue Dataset for Histopathology (OCELOT) Challenge 2023. Here, using three large multicentre colorectal cancer (CRC) cohorts (N = 1,097 patients) with digital pathology and multi-omic data, we compare the utility and accuracy of TPE with SoftCTM versus CP and bioinformatic deconvolution methods (RNA expression, DNA methylation) for downstream molecular analysis, including CNV profiling. SoftCTM showed technical repeatability when applied twice on the same slide (r = 1.0) and excellent correlations in paired H&E slides (r > 0.9). TPEs profiled by SoftCTM correlated highly with RNA expression (r = 0.59) and DNA methylation (r = 0.40), while TPEs by CP showed a lower correlation with RNA expression (r = 0.41) and DNA methylation (r = 0.29). We show that CP and deconvolution methods respectively underestimate and overestimate tumour content compared to SoftCTM, resulting in 6-13% differing CNV calls. In summary, TPE with SoftCTM enables reproducibility, automation, and standardisation at single-cell resolution. 
SoftCTM estimates (M = 58.9%, SD ±16.3%) reconcile the overestimation by molecular data extrapolation (RNA expression: M = 79.2%, SD ±10.5%; DNA methylation: M = 62.7%, SD ±11.8%) and underestimation by CP (M = 35.9%, SD ±13.1%), providing a more reliable middle ground. A fully integrated computational pathology solution could therefore be used to improve downstream molecular analyses for research and clinics.
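To illustrate why the purity estimate matters for CNV profiling, the following is a minimal sketch and not the authors' pipeline. It assumes purity is the fraction of detected cells classified as tumour, and it uses a simple two-component mixture model with a diploid normal compartment and a diploid baseline. The function names and the log2 ratio of 0.25 are hypothetical; only the three mean purity values are taken from the abstract.

```python
import math


def purity_from_counts(tumour_cells: int, non_tumour_cells: int) -> float:
    """Tumour purity as the fraction of detected cells that are tumour cells."""
    total = tumour_cells + non_tumour_cells
    if total == 0:
        raise ValueError("no cells detected")
    return tumour_cells / total


def tumour_copy_number(log2_ratio: float, purity: float, normal_copies: int = 2) -> float:
    """Infer the tumour copy number from an observed log2 ratio.

    Assumed mixture model (illustrative only):
        observed_copies = purity * tumour_copies + (1 - purity) * normal_copies
    where observed_copies = normal_copies * 2 ** log2_ratio.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    observed_copies = normal_copies * 2 ** log2_ratio
    return (observed_copies - (1 - purity) * normal_copies) / purity


if __name__ == "__main__":
    # Mean purity estimates reported in the abstract, as fractions.
    mean_purity = {"CP": 0.359, "SoftCTM": 0.589, "RNA expression": 0.792}
    hypothetical_log2_ratio = 0.25  # made-up segment value, for illustration
    for method, purity in mean_purity.items():
        copies = tumour_copy_number(hypothetical_log2_ratio, purity)
        print(f"{method}: purity={purity:.3f}, inferred copies={copies:.2f}")
```

Under these assumptions, the same observed segment is assigned a different inferred tumour copy number depending on which purity estimate is supplied, which is the mechanism by which differing purity estimates can lead to differing CNV calls.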
Publication status:
Published
Peer review status:
Peer reviewed

Publisher copy:
10.1002/path.6376

Authors

Role:
Author
ORCID:
0009-0009-6703-9368

Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Sub department:
Institute of Biomedical Engineering
Role:
Author

Institution:
University of Oxford
Division:
MSD
Department:
Oncology
Role:
Author
ORCID:
0000-0002-8154-6253


Funder identifier:
https://ror.org/03x94j517
Grant:
MR/M016587/1
Funder identifier:
https://ror.org/0439y7842
Grant:
EP/M013774/1


Publisher:
Wiley
Journal:
Journal of Pathology
Volume:
265
Issue:
2
Pages:
184-197
Place of publication:
England
Publication date:
2024-12-22
Acceptance date:
2024-10-29
DOI:
10.1002/path.6376
EISSN:
1096-9896
ISSN:
0022-3417
Pmid:
39710952


Language:
English
Keywords:
Pubs id:
2072635
Local pid:
pubs:2072635
Deposit date:
2025-01-06
ARK identifier:
