Thesis
Identifying populations at high risk of infections and antimicrobial resistance using large-scale electronic health record data
- Abstract:
Identifying populations at increased risk of infections can help inform public health strategies to reduce the incidence of disease. Electronic health records (EHRs) and large cohort data offer an opportunity to consider and identify many risk factors for various infections; however, it is unclear how to achieve this, especially in near real-time scenarios. This was particularly important during the rapidly evolving COVID-19 pandemic but is also important for monitoring common bloodstream infections (BSIs) such as Escherichia coli (E. coli) BSIs. This thesis therefore aimed to identify populations at a higher risk of infections using large datasets.
I first developed a real-time screening process to monitor associations between SARS-CoV-2 infection and demographic and behavioural risk factors. I considered potential confounders, multiple testing, collinearity, and reverse causality during the development of the process and demonstrated its use between July 2020 and 2021. I then explored methods to identify changes in growth rates of SARS-CoV-2 prevalence, comparing Iterative Sequential Regression and second derivatives of generalised additive models. I found that both methods could find change-points around 3-5 weeks after they occurred in the data and that change-points could be detected earlier within specific subgroups. I next explored whether I could extend my learning and the methods I developed for use during the COVID-19 pandemic to investigate different diseases in varying data sources. I investigated the challenges in using large-scale EHRs when conducting case-control studies to identify risk factors for E. coli BSIs, specifically how to define control groups and risk factors for analyses of routinely collected data. I found missing data to be a key consideration when choosing a control group and that reverse causality could impact associations between calculated risk factors and E. coli BSIs. Finally, I extended the screening process originally developed for the COVID-19 pandemic and implemented it on EHR data to identify associations between risk factors and E. coli BSIs. I discussed potential interventions based on these findings.
Overall, this thesis demonstrated effective methods for identifying populations at increased risk of infectious diseases using large datasets. With the continuing growth of EHRs, leveraging these resources to monitor at-risk populations could enhance the targeting of future interventions, ultimately aiming to reduce the burden of disease.
Authors
Contributors
- Institution:
- University of Oxford
- Division:
- MSD
- Department:
- NDM
- Role:
- Supervisor
- Institution:
- University of Oxford
- Division:
- MSD
- Department:
- NDM
- Sub department:
- Big Data Institute
- Role:
- Supervisor
- ORCID:
- 0000-0001-5095-6367
- Role:
- Supervisor
- Funder identifier:
- https://ror.org/0187kwz08
- Grant:
- NIHR200915
- DOI:
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
- English
- Keywords:
- Subjects:
- Deposit date:
- 2024-10-23