Journal article

Prior knowledge on context-driven DNA fragmentation probabilities can improve de novo genome assembly algorithms

Abstract:
Background: De novo genome assembly poses challenges when dealing with highly degraded DNA samples or ultrashort sequencing reads. Probabilistic approaches have been proposed to enhance the algorithms, but existing methods rely solely on expected k-mer frequencies in the assemblies, neglecting the broader sequence context that strongly influences DNA fragmentation patterns.
Results: Here, we present a proof of concept showing that prior knowledge of sequence context-driven DNA breakage propensities, expressed through a dedicated parameterisation of k-mer assigned breakage probabilities, can be used to recover DNA assemblies that originate from the fragmentation patterns more likely to have occurred. Our approach is beneficial even for read lengths below the common 25 bp threshold of modern de novo genome assembly algorithms, and well below the threshold for ultrashort fragments used in ancient DNA research.
Conclusions: This work could lay the groundwork for future enhanced de novo genome assembly algorithms, with improved ability to assemble and evaluate ultrashort DNA fragments relevant for cell-free, ancient, and forensic DNA research.
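The idea in the abstract, ranking candidate assemblies by how plausible their implied fragmentation pattern is, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `break_probs` table, its toy values, the `default` fallback probability, and the choice of an even `k` centred on the break are all assumptions made for illustration only.

```python
import math


def breakpoint_kmer(assembly, pos, k):
    """Return the k-mer centred on the break between pos-1 and pos.

    Assumes k is even. Returns None when the window would run past
    either end of the assembly.
    """
    half = k // 2
    if pos - half < 0 or pos + half > len(assembly):
        return None
    return assembly[pos - half:pos + half]


def fragmentation_log_likelihood(assembly, placements, break_probs,
                                 k=4, default=1e-3):
    """Sum log breakage probabilities over the read ends implied by
    placing reads on a candidate assembly.

    placements: list of (start, length) tuples, one per read.
    break_probs: hypothetical mapping k-mer -> breakage probability.
    Every internal read end counts as one observed break.
    """
    total = 0.0
    for start, length in placements:
        for pos in (start, start + length):
            if pos <= 0 or pos >= len(assembly):
                continue  # assembly ends are not internal breaks
            kmer = breakpoint_kmer(assembly, pos, k)
            if kmer is None:
                continue
            total += math.log(break_probs.get(kmer, default))
    return total


# Toy values, invented for this example (not measured propensities).
break_probs = {"ACGT": 0.2, "TTAA": 0.05}
placements = [(0, 4), (4, 4)]  # two 4 bp reads tiling an 8 bp candidate

score_a = fragmentation_log_likelihood("GGACGTCC", placements, break_probs)
score_b = fragmentation_log_likelihood("GGTTAACC", placements, break_probs)
best = "a" if score_a > score_b else "b"
```

Under these toy values, the candidate whose read ends fall on the more breakage-prone k-mer receives the higher score. A real method would need breakage probabilities parameterised from data and a principled way to combine this score with the usual overlap evidence.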
Publication status:
Published
Peer review status:
Peer reviewed

Authors

Institution:
University of Oxford
Division:
MSD
Department:
Radcliffe Department of Medicine
Sub department:
RDM-Strategic
Role:
Author
Institution:
University of Oxford
Division:
MSD
Department:
Radcliffe Department of Medicine
Sub department:
RDM-Strategic
Role:
Author


Publisher:
BioMed Central
Journal:
BMC Bioinformatics
Volume:
26
Issue:
1
Article number:
245
Publication date:
2025-10-13
Acceptance date:
2025-09-05
DOI:
EISSN:
1471-2105
ISSN:
1471-2105


Language:
English
Keywords:
Pubs id:
2302553
Local pid:
pubs:2302553
Source identifiers:
3368036
Deposit date:
2025-10-13
ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
