Thesis
Genetic and genomic analysis of Arabidopsis thaliana with low-coverage next-generation sequencing data
- Abstract:
Next-generation sequencing technologies have transformed our understanding of genetic variation segregating in populations and its relationship with phenotypic traits. Sequencing large populations at low coverage, thus sampling only a fraction of the genome of each individual, may increase statistical power in genetic mapping [Pasaniuc, 2012] compared to genotyping arrays. This thesis explores several novel applications of low-coverage population-based sequencing, using data from 488 recombinant inbred lines from the MAGIC population of Arabidopsis thaliana, descended from 19 inbred founder accessions. Based on the full catalogue of genetic variation that is available in the 19 founders [Gan, 2011], I describe every MAGIC genome as a mosaic of founder haplotypes and analyse the accuracy of the mosaics by simulation. I then use the mosaics in three ways. First, I investigate structural variation using a novel method that treats anomalies in the alignment of sequencing reads, potentially representing signatures of structural variants (SVs), as quantitative traits. These can be mapped genetically to identify loci in which genetic variation correlates with signatures of SVs. The method can distinguish short-range (e.g. indels) and long-range (e.g. translocations) SVs and has led to the discovery of a large number of SVs segregating in the MAGIC population, including thousands of long-range SVs. I show that SVs have a significant impact on silencing gene expression and that they explain a large fraction of the phenotypic variation in several physiological traits. Second, I use the mosaic structure of the MAGIC lines to map recombination events and analyse lineage-specific recombination in MAGIC. I infer recombination hotspots and compare recombination in the MAGIC lines to the Arabidopsis genetic map.
Finally, I detect bacterial endosymbionts hosted in MAGIC genomes from unmapped reads that have high sequence similarity with bacterial DNA and examine whether variation in the presence of endosymbionts can be explained by host genetic variation.
- Type of award: DPhil
- Level of award: Doctoral
- Awarding institution: University of Oxford
- Language: English
- UUID: uuid:136aed76-8a6a-401f-99e4-559a2939cd21
- Deposit date: 2016-04-10
Terms of use
- Copyright holder: Martha Imprialou
- Copyright date: 2015