
Journal article

Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank

Abstract:
Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone.
Publication status:
Published
Peer review status:
Peer reviewed

Access Document

Files:
Publisher copy:
10.1016/j.xgen.2023.100371
Publication website:
https://pure.rug.nl/ws/files/877691422/1-s2.0-S2666979X23001660-main.pdf

Authors

Institution:
University of Oxford
Department:
Big Data Institute
Role:
Author
ORCID:
0000-0002-6966-9306

Institution:
University of Oxford
Department:
Big Data Institute
Role:
Author
ORCID:
0000-0001-6773-9182

Institution:
University of Oxford
Department:
Big Data Institute
Role:
Author
ORCID:
0000-0002-4502-2209

Institution:
University of Oxford
Department:
Big Data Institute
Role:
Author
ORCID:
0000-0002-5012-4162

Institution:
University of Oxford
Role:
Author


Funder identifier:
10.13039/100010269
Grant:
100956/Z/13/Z

Funder identifier:
10.13039/501100005150

Funder identifier:
10.13039/100007421


Publisher:
Cell Press
Journal:
Cell Genomics
Volume:
3
Issue:
8
Pages:
100371-100371
Article number:
100371
Publication date:
2023-08-01
DOI:
EISSN:
2666-979X
ISSN:
2666-979X


Language:
English
Pubs id:
1515781
Local pid:
pubs:1515781
Source identifiers:
W4385455057
Deposit date:
2026-05-12
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
