Journal article
araCNA: somatic copy number profiling using long-range sequence models
- Abstract:
- Somatic copy number alterations (CNAs) are hallmarks of cancer. Current algorithms that call CNAs from whole-genome sequenced (WGS) data have not exploited deep learning methods owing to computational scaling limitations. Here, we present a novel deep-learning approach, araCNA, trained only on simulated data that can accurately predict CNAs in real WGS cancer genomes. araCNA uses novel transformer alternatives (e.g. Mamba) to handle genomic-scale sequence lengths (∼1M) and learn long-range interactions. Results are extremely accurate on simulated data, and this zero-shot approach is on par with existing methods when applied to 50 WGS samples from the Cancer Genome Atlas. Notably, our approach requires only a tumour sample and not a matched normal sample, has fewer markers of overfitting, and performs inference in only a few minutes. araCNA demonstrates how domain knowledge can be used to simulate training sets that harness the power of modern machine learning in biological applications.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Version of record (pdf, 8.2MB)
- Other (pdf, 1.8MB)
- Publisher copy:
- 10.1093/nargab/lqaf124
Authors
- Funder:
- Engineering and Physical Sciences Research Council
- Funder identifier:
- https://ror.org/0439y7842
- Publisher:
- Oxford University Press
- Journal:
- NAR Genomics and Bioinformatics
- Volume:
- 7
- Issue:
- 3
- Article number:
- lqaf124
- Publication date:
- 2025-09-09
- Acceptance date:
- 2025-08-14
- DOI:
- EISSN:
- 2631-9268
- ISSN:
- 2631-9268
- Language:
- English
- Source identifiers:
- 3270158
- Deposit date:
- 2025-09-09
- ARK identifier:
Terms of use
- Copyright date:
- 2025
- Notes:
- This work is related to the thesis "Novel machine learning for applications in cancer genomics".
- Licence:
- CC Attribution (CC BY)