A Bayesian Monte Carlo approach for predicting the spread of infectious diseases

Authors: Olivera Stojanović aff001; Johannes Leugering aff001; Gordon Pipa aff001; Stéphane Ghozzi aff002; Alexander Ullrich aff002
Author affiliations: Department of Neuroinformatics, Institute of Cognitive Science, Osnabrück University, Osnabrück, Germany aff001; Department of Infectious Diseases, Robert Koch Institute, Berlin, Germany aff002
Published in: PLoS ONE 14(12)
Category: Research Article
doi: https://doi.org/10.1371/journal.pone.0225838


In this paper, a simple yet interpretable probabilistic model is proposed for the prediction of reported case counts of infectious diseases. A spatio-temporal kernel is derived from training data to capture the typical interaction effects of reported infections across time and space, which provides insight into the dynamics of the spread of infectious diseases. Testing the model on a one-week-ahead prediction task for campylobacteriosis and rotavirus infections across Germany, as well as Lyme borreliosis across the federal state of Bavaria, shows that the proposed model performs on par with the state-of-the-art hhh4 model. However, in addition to model predictions, it provides a full posterior distribution over parameters, which aids in the assessment of the model. The employed Bayesian Monte Carlo regression framework is easily extensible and allows for incorporating prior domain knowledge, which makes it suitable for use on limited yet complex datasets, as are often encountered in epidemiology.
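To make the idea of a spatio-temporal kernel more concrete, the following is a minimal sketch of how past reported cases could be turned into features for a count regression. It is not the authors' implementation: the Gaussian spatial profile, the exponential temporal decay, the log link, and all names (`kernel_features`, `expected_count`, `sigmas_km`, `taus_weeks`) are assumptions chosen for illustration. In a Bayesian treatment, the weights and intercept would be given priors and sampled rather than fixed by hand.

```python
import math

def kernel_features(cases, target_xy, target_week, sigmas_km, taus_weeks):
    """Sum basis-function responses over past cases (hypothetical helper).

    cases: iterable of (x_km, y_km, week) tuples for reported infections.
    Returns one feature per (sigma, tau) pair: nearby and recent cases
    contribute more than distant or old ones.
    """
    features = []
    for sigma in sigmas_km:
        for tau in taus_weeks:
            total = 0.0
            for x, y, week in cases:
                lag = target_week - week
                if lag <= 0:
                    continue  # only strictly earlier cases may inform a forecast
                dist = math.hypot(x - target_xy[0], y - target_xy[1])
                spatial = math.exp(-0.5 * (dist / sigma) ** 2)  # assumed Gaussian profile
                temporal = math.exp(-lag / tau)                 # assumed exponential decay
                total += spatial * temporal
            features.append(total)
    return features

def expected_count(features, weights, intercept):
    """Map features to a non-negative expected case count via a log link."""
    return math.exp(intercept + sum(w * f for w, f in zip(weights, features)))

# Illustrative data only: two past cases and one case in the target week itself.
cases = [(0.0, 0.0, 9), (30.0, 40.0, 8), (0.0, 0.0, 10)]
features = kernel_features(cases, (0.0, 0.0), 10, [25.0], [2.0])
mu = expected_count(features, weights=[0.8], intercept=0.1)
```

The learned weights play the role of the kernel here: inspecting them (or their posterior distribution) shows how strongly cases at a given distance and time lag are associated with future counts.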

Keywords:

Epidemiology – Germany – Kernel functions – Lyme disease – Probability distribution – Rotavirus infection – Campylobacteriosis – Borrelia infection



Article published in the journal

2019, Issue 12