A Hierarchical Bayesian Meta-Analytic Helicobacter Pylori (H. pylori) Infection Generator for the U.S. Population
Poster — SMDM 18th Biennial European Conference, Berlin, Germany
Modeling the impact of H. pylori screen-and-treat strategies on gastric cancer in the United States requires knowing how quickly people acquire the infection, stratified by race and ethnicity and by age, while accounting for the fact that transmission rates have been declining for decades. No single national survey provides this. This poster introduced the infection generator that makes the downstream models possible: a hierarchical Bayesian framework that synthesizes 41 published US seroprevalence studies into race-stratified, age-specific force-of-infection curves.
The Problem
H. pylori is the primary cause of non-cardia gastric cancer and is responsible for roughly 90% of duodenal ulcers and 70–80% of gastric ulcers worldwide. In the United States, its burden falls disproportionately on racial and ethnic minorities — American Indian/Alaska Native, Hispanic, and non-Hispanic Black populations face seroprevalence rates of 43–82%, compared to 8–37% in non-Hispanic White populations.
Modeling the long-term health and economic consequences of H. pylori eradication strategies requires dynamic transmission inputs: not just current prevalence, but the rate at which susceptible individuals acquire infection (the force of infection), stratified by age and race/ethnicity, and corrected for birth-cohort trends showing that transmission has been declining across generations. These inputs do not exist in any single US dataset. The largest national survey — NHANES — captures seroprevalence but lacks the geographic and racial depth needed for stratified modeling. Dozens of smaller published studies exist, but synthesizing them requires resolving incompatible sampling frames, assay methods, and covariate definitions.
The Approach
We extended a validated hierarchical Bayesian methodology first developed for Mexico — where a single national seroepidemiological survey existed — to the US context, where no equivalent survey exists. Instead of a single dataset, the model pools seroprevalence estimates from 41 published US studies (56,845 individuals, data collected 1965–2014), identified through a companion scoping review following PRISMA-ScR guidelines.
The model fits a modified exponential catalytic curve to age-stratified seroprevalence by race/ethnicity, estimating:
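- α: the fraction of each racial/ethnic group ultimately susceptible to infection (the plateau the seroprevalence curve approaches at older ages)
- γ: the force of infection, that is, the instantaneous rate at which susceptible individuals acquire infection, expressed per year of age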
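The role of the two parameters can be illustrated with a minimal sketch. It assumes the common catalytic form p(a) = α(1 − exp(−γa)), where p(a) is expected seroprevalence at age a. The exact "modified" parameterization used on the poster is not given here, so the function name, the formula, and the parameter values below are illustrative assumptions rather than the author's implementation.

```python
import math

def catalytic_seroprevalence(age, alpha, gamma):
    """Expected seroprevalence at `age` under an ASSUMED catalytic curve.

    alpha: fraction ultimately susceptible (upper plateau), in [0, 1]
    gamma: force of infection per year of age, >= 0
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0.0 or age < 0.0:
        raise ValueError("gamma and age must be non-negative")
    # Fraction of susceptibles infected by `age` is 1 - exp(-gamma * age);
    # scaling by alpha caps the curve at the susceptible fraction.
    return alpha * (1.0 - math.exp(-gamma * age))

# Hypothetical values chosen only to show the shape of the curve.
for age in (0, 10, 30, 60, 90):
    p = catalytic_seroprevalence(age, alpha=0.6, gamma=0.05)
    print(f"age {age:>2}: {p:.3f}")
```

Under this form, γ controls how steeply seroprevalence rises with age and α controls where it levels off, which is why the two are estimated jointly.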
Parameters are estimated jointly using a hierarchical Bayesian structure that shrinks race-specific curves toward a grand mean, enabling stable inference even where individual-group data are sparse. The model is implemented in JAGS via Gibbs sampling, producing full posterior distributions over both parameters and the predicted seroprevalence curves.
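The shrinkage idea can be sketched outside JAGS with a much simpler stand-in: a normal-normal partial-pooling calculation with a known between-group standard deviation. This is not the poster's model or its Gibbs sampler. The function name, the closed-form weights, and all numbers are assumptions chosen only to show how sparse groups are pulled harder toward the grand mean.

```python
def shrink_toward_grand_mean(estimates, std_errors, tau):
    """Partial pooling of group estimates (illustrative, not the JAGS model).

    estimates:  per-group point estimates (e.g., a force of infection)
    std_errors: per-group standard errors (larger = sparser data)
    tau:        ASSUMED known between-group standard deviation, > 0
    """
    if tau <= 0.0:
        raise ValueError("tau must be positive")
    # Grand mean weighted by total precision 1 / (se^2 + tau^2).
    weights = [1.0 / (se ** 2 + tau ** 2) for se in std_errors]
    grand_mean = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    shrunk = []
    for y, se in zip(estimates, std_errors):
        # b near 1: trust the group's own data; b near 0: rely on the grand mean.
        b = tau ** 2 / (tau ** 2 + se ** 2)
        shrunk.append(b * y + (1.0 - b) * grand_mean)
    return grand_mean, shrunk

# Hypothetical inputs: the third group has a large standard error (sparse data).
grand, pooled = shrink_toward_grand_mean(
    estimates=[0.02, 0.05, 0.08],
    std_errors=[0.005, 0.005, 0.05],
    tau=0.02,
)
print(round(grand, 4), [round(x, 4) for x in pooled])
```

In a full hierarchical Bayesian fit the between-group variance is itself estimated and the pooling emerges from the posterior rather than from a closed-form weight, but the qualitative behavior is the same.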
What We Found
The model recovered race-stratified force-of-infection estimates with full uncertainty quantification:
- American Indian/Alaska Native and Hispanic populations carry the highest infection burden (seroprevalence 43–91%), reflecting both elevated transmission rates and lower historical access to eradication treatment.
- Non-Hispanic White populations show the lowest burden (8–37%), with declining trends across birth cohorts consistent with improving sanitation and earlier treatment.
- All groups show evidence of declining transmission across birth cohorts — more recent generations are acquiring H. pylori later and at lower rates than those born earlier in the 20th century.
The infection generator produces posterior predictive seroprevalence curves by age and race/ethnicity — continuous, probabilistic inputs that downstream decision-analytic models can sample directly, propagating structural uncertainty through to cost-effectiveness estimates.
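As a rough sketch of how a downstream model might consume such output, the example below turns a set of (α, γ) draws into age-specific seroprevalence curves and summarizes them with a mean and a 95% interval. The draws are simulated from arbitrary distributions as stand-ins for real posterior samples, the curve uses an assumed catalytic form α(1 − exp(−γa)), and every function name is hypothetical.

```python
import math
import random
import statistics

def seroprevalence_curve(alpha, gamma, ages):
    """Curve under an ASSUMED catalytic form alpha * (1 - exp(-gamma * age))."""
    return [alpha * (1.0 - math.exp(-gamma * a)) for a in ages]

def summarize_curves(draws, ages, lower=0.025, upper=0.975):
    """Return (age, mean, lower, upper) across parameter draws for each age."""
    curves = [seroprevalence_curve(alpha, gamma, ages) for alpha, gamma in draws]
    summary = []
    for i, age in enumerate(ages):
        values = sorted(curve[i] for curve in curves)
        lo = values[int(lower * (len(values) - 1))]
        hi = values[int(upper * (len(values) - 1))]
        summary.append((age, statistics.fmean(values), lo, hi))
    return summary

# Simulated stand-ins for posterior draws; NOT results from the poster.
rng = random.Random(0)
draws = [
    (rng.betavariate(60, 40), rng.gammavariate(25, 0.002))  # (alpha, gamma)
    for _ in range(2000)
]

for age, mean, lo, hi in summarize_curves(draws, ages=[10, 30, 60]):
    print(f"age {age}: mean {mean:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")

# A probabilistic downstream model would instead pick one draw per simulation run:
alpha_i, gamma_i = rng.choice(draws)
```

Sampling whole (α, γ) pairs, rather than each parameter separately, preserves their posterior correlation when uncertainty is propagated downstream.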
Why It Matters
This infection generator is the upstream input for all US-based H. pylori decision-analytic models in the Stanford group’s research program. Without race-stratified force-of-infection estimates, transmission models collapse heterogeneous populations into a single average — hiding the disparities that health policy needs to see. The framework has since been deployed in a full SIS dynamic transmission model (Kaufmann, Roa, et al.) evaluating mass testing and eradication strategies across the US population, and its outputs directly inform estimates of who benefits most from screen-and-treat programs targeting gastric cancer prevention.
Citation
Roa, J. (2023). A Hierarchical Bayesian Meta-Analytic Helicobacter Pylori (H. pylori) Infection Generator for the U.S. Population. Society for Medical Decision Making (SMDM) 18th Biennial European Conference, Berlin, Germany.
BibTeX citation
@inproceedings{j.2023,
author = {Roa, J.},
title = {A {Hierarchical} {Bayesian} {Meta-Analytic} {Helicobacter}
{Pylori} {(H.} Pylori) {Infection} {Generator} for the {U.S.}
{Population}},
booktitle = {SMDM 18th Biennial European Conference},
date = {2023-10-22},
url = {https://jorgeroac.com/publications/papers/conference-presentations/hpylori-poster-smdm/},
langid = {en}
}