Genome-wide identification of human functional DNA using a neutral indel model

Lunter, G; Ponting, C; Hein, J

Journal article

Abstract:: It has become clear that a large proportion of functional DNA in the human genome does not code for protein. Identification of this non-coding functional sequence using comparative approaches is proving difficult and has previously been thought to require deep sequencing of multiple vertebrates. Here we introduce a new model and comparative method that, instead of nucleotide substitutions, uses the evolutionary imprint of insertions and deletions (indels) to infer the past consequences of selection. The model predicts the distribution of indels under neutrality, and shows an excellent fit to human-mouse ancestral repeat data. Across the genome, many unusually long ungapped regions are detected that are unaccounted for by the neutral model, and which we predict to be highly enriched in functional DNA that has been subject to purifying selection with respect to indels. We use the model to determine the proportion under indel-purifying selection to be between 2.56% and 3.25% of human euchromatin. Since annotated protein-coding genes comprise only 1.2% of euchromatin, these results lend further weight to the proposition that more than half the functional complement of the human genome is non-protein-coding. The method is surprisingly powerful at identifying selected sequence using only two or three mammalian genomes. Applying the method to the human, mouse, and dog genomes, we identify 90 Mb of human sequence under indel-purifying selection, at a predicted 10% false-discovery rate and 75% sensitivity. As expected, most of the identified sequence represents unannotated material, while the recovered proportions of known protein-coding and microRNA genes closely match the predicted sensitivity of the method. The method's high sensitivity to functional sequence such as microRNAs suggests that as yet unannotated microRNA genes are enriched among the sequences identified.
Furthermore, its independence of substitutions allowed us to identify sequence that has been subject to heterogeneous selection, that is, sequence subject to both positive selection with respect to substitutions and purifying selection with respect to indels. The ability to identify elements under heterogeneous selection enables, for the first time, the genome-wide investigation of positive selection on functional elements other than protein-coding genes.
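
The "more than half" statement in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not taken from the article: the function name `noncoding_share` is hypothetical, and it assumes that all annotated protein-coding sequence (1.2% of euchromatin) lies within the fraction estimated to be under indel-purifying selection (2.56% to 3.25%).

```python
def noncoding_share(selected_pct: float, coding_pct: float) -> float:
    """Share of selected sequence that is not protein-coding.

    Assumption (not stated in the abstract): every protein-coding base
    is counted inside the selected fraction, so the non-coding share is
    simply 1 - coding / selected.
    """
    if not 0 < coding_pct <= selected_pct:
        raise ValueError("expected 0 < coding_pct <= selected_pct")
    return 1.0 - coding_pct / selected_pct


# Figures quoted in the abstract, as percentages of human euchromatin.
CODING_PCT = 1.2
SELECTED_RANGE_PCT = (2.56, 3.25)

for selected_pct in SELECTED_RANGE_PCT:
    share = noncoding_share(selected_pct, CODING_PCT)
    print(f"{selected_pct:.2f}% selected: {share:.1%} non-protein-coding")
```

Under that assumption the non-coding share comes out at roughly 53% for the lower bound and roughly 63% for the upper bound, so it exceeds one half across the whole quoted range.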

Publication status:: Published

Peer review status:: Peer reviewed

Cite this record

APA Style

Lunter, G., Ponting, C., & Hein, J. (2006). Genome-wide identification of human functional DNA using a neutral indel model. PLoS Computational Biology, 2(1), e5.

MLA Style

Lunter, G, et al. “Genome-Wide Identification of Human Functional DNA Using a Neutral Indel Model.” PLoS Computational Biology, vol. 2, no. 1, 2006, p. e5.

Chicago Style

Lunter, G, C Ponting, and J Hein. 2006. “Genome-Wide Identification of Human Functional DNA Using a Neutral Indel Model.” PLoS Computational Biology 2 (1): e5.

Access Document

Files:: Genome-Wide Identification of Human Functional DNA Using a Neu...

(Version of record, pdf, 541.6KB)

Publisher copy:: 10.1371/journal.pcbi.0020005

Authors

Lunter, G

Role:: Author

Ponting, C

Institution:: University of Oxford
Division:: Medical Sciences Division
Department:: Physiology Anatomy and Genetics
Role:: Author

Hein, J

Role:: Author

Publisher:: Public Library of Science
Journal:: PLoS Computational Biology
Volume:: 2
Issue:: 1
Pages:: e5
Publication date:: 2006-01-13
Acceptance date:: 2005-11-30
DOI:: 10.1371/journal.pcbi.0020005
EISSN:: 1553-7358
ISSN:: 1553-734X

Language:: English
Keywords:: Genome, Human; Models, Genetic; Base Sequence; Humans; Selection, Genetic; Genomics; Genetic Variation; SBTMR; Base Composition; DNA; Animals
Pubs id:: 68272
UUID:: uuid:dd61f723-e55b-4935-8280-2d100d8572a0
Local pid:: pubs:68272
Source identifiers:: 68272
Deposit date:: 2012-12-19
ARK identifier:: ark:/29072/ora_dd61f723e55b493582802d100d8572a0

Terms of use

Copyright holder:: Lunter et al
Notes:: © 2006 Lunter et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Licence:: CC Attribution (CC BY)