Thesis
Biobank-scale ancestral recombination graphs: inference and applications to the analysis of complex traits
- Abstract:
Across living species, DNA is transmitted from generation to generation via the processes of inheritance, mutation, and recombination. The history of these processes can be recorded using genome-wide gene genealogies. Accurate inference of gene genealogies from genetic data has the potential to facilitate a wide range of analyses, but is computationally challenging. In this thesis, we introduce a scalable method, called ARG-Needle, that uses genotype hashing and a coalescent hidden Markov model to infer genome-wide genealogies from sequencing or genotyping array data in modern biobanks. We develop strategies that utilise the inferred genome-wide genealogies within linear mixed models to perform association and other analyses of biomedical traits.
We validate the accuracy and scalability of ARG-Needle through extensive coalescent simulations, and use ARG-Needle to build genome-wide genealogies from genotypes of 337,464 UK Biobank individuals. We perform genealogy-based association analysis of 7 complex traits, detecting more rare and ultra-rare signals (N = 133, frequency range 0.0004% to 0.1%) than genotype imputation from ∼65,000 sequenced haplotypes (N = 65). We validate these signals using exome sequencing data from 138,039 individuals. ARG-Needle associations strongly tag (average r = 0.72) underlying sequencing variants that are enriched for missense (2.3×) and loss-of-function (4.5×) variation. Compared to imputation, inferred genealogies also capture additional signals for higher-frequency variants. These results demonstrate that biobank-scale inference of gene genealogies may be leveraged in the analysis of complex traits, complementing approaches that require the availability of large, population-specific sequencing panels.
Authors
Contributors
- Institution:
- University of Oxford
- Division:
- MPLS
- Department:
- Statistics
- Role:
- Supervisor
- Funder identifier:
- http://dx.doi.org/10.13039/501100014748
- Grant:
- Clarendon Scholarship
- Programme:
- Clarendon Scholarship
- Funder identifier:
- http://dx.doi.org/10.13039/501100000781
- Grant:
- ARGPHENO 850869
- Programme:
- ERC Starting Grant no. ARGPHENO 850869
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Deposit date:
- 2023-06-05
Terms of use
- Copyright holder:
- Zhang, BC
- Copyright date:
- 2022