Journal article
Inference from samples of DNA sequences using a two-locus model.
- Abstract:
- Performing inference on contemporary samples of DNA sequence data is an important and challenging task. Computationally intensive methods such as importance sampling (IS) are attractive because they make full use of the available data, but in the presence of recombination the large state space of genealogies can be prohibitive. In this article, we make progress by developing an efficient IS proposal distribution for a two-locus model of sequence data. We show that the proposal developed here leads to much greater efficiency, outperforming existing IS methods that could be adapted to this model. Among several possible applications, the algorithm can be used to find maximum likelihood estimates for mutation and crossover rates, and to perform ancestral inference. We illustrate the method on previously reported sequence data covering two loci either side of the well-studied TAP2 recombination hotspot. The two loci are themselves largely non-recombining, so we obtain a gene tree at each locus and are able to infer in detail the effect of the hotspot on their joint ancestry. We summarize this joint ancestry by introducing the gene graph, a summary of the well-known ancestral recombination graph.
- Publication status:
- Published
Access Document
- Publisher copy:
- 10.1089/cmb.2009.0231
- Journal:
- Journal of computational biology: a journal of computational molecular cell biology
- Volume:
- 18
- Issue:
- 1
- Pages:
- 109-127
- Publication date:
- 2011-01-01
- EISSN:
- 1557-8666
- ISSN:
- 1066-5277
- Language:
- English
- Pubs id:
- pubs:112491
- UUID:
- uuid:09a641b9-3840-4858-b448-1404792ebeb7
- Local pid:
- pubs:112491
- Source identifiers:
- 112491
- Deposit date:
- 2012-12-19
Terms of use
- Copyright date:
- 2011