Journal article

Phasing for medical sequencing using rare variants and large haplotype reference panels

Abstract:

Motivation: There is growing recognition that estimating haplotypes from high-coverage sequencing of single samples in clinical settings is an important problem. At the same time, very large datasets consisting of tens to hundreds of thousands of high-coverage sequenced samples will soon be available. We describe a method that takes advantage of these huge human genetic variation resources and of rare variant sharing patterns to estimate haplotypes in single sequenced samples. A rare variant shared between two individuals is more likely to have arisen from a recent common ancestor and, hence, is also more likely to indicate similar shared haplotypes over a substantial flanking region of sequence.
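
The rare variant sharing idea can be illustrated with a minimal sketch. It is not the published implementation: the function name `rank_by_rare_sharing`, the allele-count threshold `max_count` and the cap `k` are hypothetical choices made for illustration, and the panel is assumed to be a small 0/1 matrix of reference haplotypes by sites.

```python
import numpy as np

def rank_by_rare_sharing(panel, sample_genotypes, max_count=2, k=4):
    """Rank reference haplotypes by rare alleles shared with one sample.

    panel            : (n_haplotypes, n_sites) array of 0/1 alleles
    sample_genotypes : (n_sites,) array of alternate-allele counts (0, 1 or 2)
    max_count        : a site is treated as rare if its alternate allele
                       appears at most this many times in the panel
                       (hypothetical threshold)
    k                : maximum number of haplotypes to return (hypothetical)
    """
    panel = np.asarray(panel)
    geno = np.asarray(sample_genotypes)

    # Alternate-allele count per site across the reference panel.
    alt_counts = panel.sum(axis=0)

    # Rare sites: observed in the panel, but only a few times.
    rare = (alt_counts > 0) & (alt_counts <= max_count)

    # Rare sites at which the sample itself carries the alternate allele.
    carried = rare & (geno > 0)

    # For each reference haplotype, count the rare alleles it shares.
    shared = panel[:, carried].sum(axis=1)

    # Highest sharing first; a stable sort keeps panel order among ties.
    order = np.argsort(-shared, kind="stable")
    chosen = [int(i) for i in order[:k] if shared[i] > 0]
    return chosen, shared
```

Haplotypes that share no rare allele with the sample are never returned, so the result may contain fewer than `k` entries. A practical method would also need to restrict the comparison to a local window of sites, which this sketch omits.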

Results: Our method exploits this idea to select a small set of highly informative copying states within a Hidden Markov Model (HMM) phasing algorithm. Using rare variants in this way allows us to avoid iterative Markov chain Monte Carlo (MCMC) methods to infer haplotypes. Compared to other approaches that do not explicitly use rare variants, we obtain significant gains in phasing accuracy, less variation across phasing runs and improvements in speed. For example, using a reference panel of 7420 haplotypes from the UK10K project, we are able to reduce switch error rates by up to 50% when phasing samples sequenced at high coverage. In addition, a single-step rephasing of the UK10K panel, using rare variant information, has a downstream impact on phasing performance. These results represent a proof of concept that rare variant sharing patterns can be utilized to phase large high-coverage sequencing studies such as the 100,000 Genomes Project dataset.
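
Switch error rate, the accuracy measure quoted above, can be sketched as follows. This uses the common definition (the fraction of consecutive heterozygous sites whose relative phase is inferred incorrectly), which the abstract itself does not spell out. The function name and argument layout are illustrative assumptions, and the estimated haplotypes are assumed to carry the same genotypes as the truth.

```python
def switch_error_rate(true_h1, true_h2, est_h1, est_h2):
    """Fraction of adjacent heterozygous-site pairs with a phase switch.

    All arguments are equal-length sequences of 0/1 alleles.
    Assumes the estimated pair has the same genotype as the true pair
    at every heterozygous site; a ValueError is raised otherwise.
    """
    # Only heterozygous sites carry phase information.
    het = [i for i, (a, b) in enumerate(zip(true_h1, true_h2)) if a != b]
    if len(het) < 2:
        return 0.0

    orientation = []
    for i in het:
        if {est_h1[i], est_h2[i]} != {true_h1[i], true_h2[i]}:
            raise ValueError(f"genotype mismatch at site {i}")
        # True if the first estimated haplotype matches the first true one.
        orientation.append(est_h1[i] == true_h1[i])

    # A switch occurs whenever the orientation changes between
    # consecutive heterozygous sites.
    switches = sum(1 for p, q in zip(orientation, orientation[1:]) if p != q)
    return switches / (len(het) - 1)
```

Because only changes in orientation are counted, swapping the two estimated haplotypes wholesale does not alter the result.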
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1093/bioinformatics/btw065

Authors

More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Statistics
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MSD
Department:
NDM
Sub department:
Wellcome Trust Centre for Human Genetics
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MSD
Department:
NDM
Sub department:
Wellcome Trust Centre for Human Genetics
Role:
Author


Funding agency for:
Marchini, J
Grant:
617306


Publisher:
Oxford University Press
Journal:
Bioinformatics
Volume:
32
Issue:
13
Pages:
1974-1980
Publication date:
2016-02-27
Acceptance date:
2016-01-29
DOI:
10.1093/bioinformatics/btw065
EISSN:
1460-2059
ISSN:
1367-4803


Pubs id:
pubs:598573
UUID:
uuid:0ee8f290-b073-4605-a474-fc164a81f86f
Local pid:
pubs:598573
Source identifiers:
598573
Deposit date:
2016-02-01
