Journal article

Limitations of current high-throughput sequencing technologies lead to biased expression estimates of endogenous retroviral elements

Abstract:
Human endogenous retroviruses (HERVs), the remnants of ancient germline retroviral integrations, comprise almost 8% of the human genome. The elucidation of their biological roles is hampered by our inability to link HERV mRNA and protein production with specific HERV loci. To solve the riddle of the integration-specific RNA expression of HERVs, several bioinformatics approaches have been proposed; however, no single process seems to yield optimal results due to the repetitiveness of HERV integrations. The performance of existing data-bioinformatics pipelines has been evaluated against real-world datasets whose true expression profile is unknown; thus, the accuracy of widely used approaches remains unclear. Here, we simulated mRNA production from specific HERV integrations to evaluate second- and third-generation sequencing technologies, along with widely used bioinformatic approaches, and to estimate their accuracy in describing integration-specific expression. We demonstrate that, while a HERV-family approach offers accurate results, per-integration analyses of HERV expression suffer from substantial expression bias, which is only partially mitigated by algorithms developed for calculating per-integration HERV expression and is more pronounced in recent integrations. Hence, this bias could lead to erroneous inferences that appear biologically meaningful. Finally, we demonstrate the merits of accurate long-read high-throughput sequencing technologies in the resolution of per-locus HERV expression.
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1093/nargab/lqae081

Authors


Institution:
University of Oxford
Role:
Author
Role:
Author
ORCID:
0000-0002-0141-4753


Publisher:
Oxford University Press
Journal:
NAR Genomics and Bioinformatics
Volume:
6
Issue:
3
Article number:
lqae081
Publication date:
2024-07-09
Acceptance date:
2024-06-27
DOI:
10.1093/nargab/lqae081
EISSN:
2631-9268
ISSN:
2631-9268


Language:
English
Pubs id:
2013326
Local pid:
pubs:2013326
Source identifiers:
2097109
Deposit date:
2024-07-09
