
Conference item

Learned geospatial priors for malaria forecasting

Abstract:

Malaria remains one of the leading communicable causes of death globally, with approximately half the world’s population at risk, predominantly across African and South Asian countries. Accurate, region-specific outbreak prediction is hindered by strong spatio-temporal heterogeneity in environmental, climatological, and sociodemographic risk factors, as well as by the over-dispersed, count-valued nature of incidence data that violates assumptions of standard regression and many off-the-shelf machine learning approaches. We address these challenges through a unified framework combining count-aware statistical modelling, sequence learning, and geospatial foundation model representations, evaluated across India and Nigeria using monthly data from 2000 to 2024 integrating rainfall, temperature, vegetation index, nighttime lights, population counts, distance to water bodies, and historical malaria case counts. First, we model malaria incidence using lag-structured negative binomial regression, explicitly accounting for over-dispersion while capturing biologically grounded temporal effects; this yields interpretable estimates of how environmental and sociodemographic covariates drive transmission. Second, we develop an ensemble of LSTM and Transformer models that captures complementary temporal dependencies in incidence dynamics. Third, we augment both approaches with embeddings from the AlphaEarth foundation model, which encode high-dimensional satellite-derived environmental context. Incorporating these embeddings consistently improves predictive performance, with the ensemble model increasing R² from 0.63 to 0.78 in Nigeria and from 0.87 to 0.88 in India, while remaining stable across random seeds.
Together, these results suggest that foundation model embeddings convert previously unexplained variance into structured environmental signal, reducing effective over-dispersion and providing a robust and transferable framework for forecasting climate-sensitive infectious diseases.
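
The two ingredients of the first modelling step named in the abstract, lagged covariates and a negative binomial likelihood, can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The function names, the choice of lags, and the NB2 parameterisation (variance = mu + alpha * mu^2) are assumptions made here for illustration only.

```python
import math


def lagged_rows(series, lags):
    """Pair each target value with the values observed `lags` steps earlier.

    Hypothetical helper: the record does not state which lags the paper uses.
    """
    max_lag = max(lags)
    features = [[series[t - k] for k in lags] for t in range(max_lag, len(series))]
    targets = list(series[max_lag:])
    return features, targets


def nb2_log_pmf(y, mu, alpha):
    """Log-probability of count y under an NB2 model with mean mu.

    alpha > 0 is the dispersion parameter; as alpha approaches 0 the
    distribution approaches a Poisson with the same mean.
    """
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def nb2_variance(mu, alpha):
    """Variance implied by NB2: larger than the mean whenever alpha > 0."""
    return mu + alpha * mu ** 2


# Toy usage with made-up numbers (not data from the paper).
features, targets = lagged_rows([1, 2, 3, 4, 5], lags=[1, 2])
total_probability = sum(math.exp(nb2_log_pmf(y, 3.0, 0.5)) for y in range(500))
```

In a regression setting the mean `mu` would be tied to the lagged covariates through a log link, and `alpha` would be estimated jointly with the coefficients. The quadratic term in `nb2_variance` is what lets the model absorb variance in excess of the mean, which a Poisson model cannot do.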

Publication status:
Submitted


Authors

Institution:
University of Oxford
Division:
MSD
Department:
NDORMS
Sub department:
Botnar Institute for Musculoskeletal Sciences
Role:
Author
ORCID:
0000-0001-5741-9062
Institution:
University of Oxford
Division:
MSD
Department:
NDORMS
Role:
Author


Publisher:
NeurIPS
Acceptance date:
2026-09-24
Event title:
40th Annual Conference on Neural Information Processing Systems (NeurIPS 2026)
Event location:
Sydney, Australia
Event website:
https://neurips.cc/Conferences/2026
Event start date:
2026-12-06
Event end date:
2026-12-12


Language:
English
Pubs id:
2434557
Local pid:
pubs:2434557
Deposit date:
2026-06-18
ARK identifier:
