Journal article

Unified framework for the ingestion of early epidemic data for downstream data analytics

Abstract:
Background: Early-phase data during an epidemic are often heterogeneous and difficult to integrate across systems, creating a need for standard tools and reporting guidelines to facilitate timely and reliable data collection. The Global.health team have developed a data schema for the ingestion of epidemic data, so that data curated to this schema can be readily ingested into existing systems for analysis. This paper describes the definition of 'core data' within the Global.health schema, which focuses data collection on the most relevant and available data to inform epidemic response during the first 100 days of an outbreak.

Methods: We used expert consultation and a structured literature review to identify key epidemiological questions and parameters that must be addressed during the first 100 days of an outbreak. Relevant digital toolkits and reporting frameworks were reviewed, and the minimum data variables required for parameter estimation were identified. These variables were mapped to the existing Global.health schema and assessed for availability in early outbreak data from four recent epidemics. Variables were categorized by availability, and those with sufficient early availability were retained in a proposed core schema. Data formats were harmonized with the WHO Epi Core, T0 and T1 toolkits to enhance interoperability. A complementary modular schema was defined to capture pathogen-specific variables.

Results: The literature review yielded 78 key epidemiological parameters relevant to early outbreak assessment, organized into eleven categories. Analysis of variable availability in early outbreak datasets showed that 42 of 140 variables in the existing Global.health schema were consistently available and suitable for inclusion in a core early-epidemic schema. Variables related to demographics, case status, symptom reporting, confirmation dates, outcomes, and exposure history were frequently available, while vaccination history, detailed treatment data, and certain clinical variables were less consistently reported. The resulting core schema comprises 42 interoperable variables across seven domains and aligns with WHO data standards and controlled terminologies.

Conclusions: Standardized, interoperable data capture during the early phase of epidemics is essential to enable timely estimation of key epidemiological parameters and to inform response strategies. The Global.health core schema provides a minimum, evidence-informed dataset for early outbreak investigation while maintaining compatibility with WHO reporting standards. By prioritizing variables that are both epidemiologically critical and realistically available in early data streams, this framework supports improved data harmonization, analysis, and decision-making during the first 100 days of an epidemic.
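
The availability screening described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the variable names, the toy records, the 0.5 threshold, and the rule that a variable must meet the threshold in every dataset are all assumptions made for illustration, since the abstract does not state how "sufficient early availability" was defined.

```python
from typing import Dict, List, Optional

# A case record maps variable names to values; None or "" means "not reported".
Record = Dict[str, Optional[str]]


def availability(records: List[Record], variable: str) -> float:
    """Return the fraction of records with a non-empty value for `variable`."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(variable) not in (None, ""))
    return filled / len(records)


def select_core_variables(
    datasets: Dict[str, List[Record]],
    variables: List[str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[str]:
    """Keep variables whose availability meets `threshold` in every dataset."""
    return [
        v
        for v in variables
        if all(availability(recs, v) >= threshold for recs in datasets.values())
    ]


# Toy early-outbreak line lists (invented values, hypothetical variable names).
datasets = {
    "outbreak_a": [
        {"age": "34", "outcome": "recovered", "vaccination_status": None},
        {"age": "51", "outcome": "", "vaccination_status": None},
        {"age": "8", "outcome": "died", "vaccination_status": "yes"},
    ],
    "outbreak_b": [
        {"age": "27", "outcome": "recovered"},
        {"age": None, "outcome": "recovered"},
    ],
}

core = select_core_variables(datasets, ["age", "outcome", "vaccination_status"])
print(core)
```

In this toy data the sparsely reported vaccination variable is dropped while the two better-reported variables are retained. A real analysis would need to choose the threshold and the cross-dataset rule deliberately.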
Publication status:
Published
Peer review status:
Peer reviewed

Actions

Access Document

Files:
Publisher copy:
10.12688/wellcomeopenres.24776.2

Authors

More by this author
Role:
Author
ORCID:
0000-0003-4285-2255
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Biology
Sub department:
Biology
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MSD
Department:
NDM
Sub department:
Pandemic Sciences Institute
Role:
Author


More from this funder
Funder identifier:
10.13039/100014013
Grant:
APP8583
More from this funder
Funder identifier:
https://ror.org/029chgv08
Grant:
226052


Publisher:
Taylor and Francis
Journal:
Wellcome Open Research
Volume:
10
Pages:
524
Publication date:
2025-01-01
DOI:
EISSN:
2398502X
ISSN:
2398502X
Pmid:
42272702


Language:
English
Keywords:
Source identifiers:
4245729
Deposit date:
2026-06-19
ARK identifier:
This ORA record was generated from metadata provided by an external service. It has not been edited by the ORA Team.
