Journal article

The Utility of Data Transformation for Alignment, De Novo Assembly and Classification of Short Read Virus Sequences

Abstract:
Advances in DNA sequencing technology are facilitating genomic analyses of unprecedented scope and scale, widening the gap between our abilities to generate and fully exploit biological sequence data. Comparable analytical challenges are encountered in other data-intensive fields involving sequential data, such as signal processing, in which dimensionality reduction (i.e., compression) methods are routinely used to lessen the computational burden of analyses. In this work, we explored the application of dimensionality reduction methods to numerically represent high-throughput sequence data for three important biological applications of virus sequence data: reference-based mapping, short sequence classification and de novo assembly. Leveraging highly compressed sequence transformations to accelerate sequence comparison, our approach yielded comparable accuracy to existing approaches, further demonstrating its suitability for sequences originating from diverse virus populations. We assessed the application of our methodology using both synthetic and real viral pathogen sequences. Our results show that the use of highly compressed sequence approximations can provide accurate results, with analytical performance retained and even enhanced through appropriate dimensionality reduction of sequence data.
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.3390/v11050394

Authors


More by this author
Role:
Author
ORCID:
0000-0002-3480-3819
More by this author
Role:
Author
ORCID:
0000-0002-6905-8513
More by this author
Institution:
University of Oxford
Division:
MPLS Division
Department:
Engineering Science
Role:
Author
ORCID:
0000-0002-5870-4030
More by this author
Role:
Author
ORCID:
0000-0002-3361-3351


Publisher:
MDPI
Journal:
Viruses
Volume:
11
Issue:
5
Article number:
394
Publication date:
2019-04-26
Acceptance date:
2019-04-22
DOI:
10.3390/v11050394
EISSN:
1999-4915
ISSN:
1999-4915
Pmid:
31035503


Pubs id:
pubs:995622
UUID:
uuid:94df4046-7b0b-4dda-8cb3-522f2935dffd
Local pid:
pubs:995622
Source identifiers:
995622
Deposit date:
2019-05-20
