KETOS: Clinical decision support and machine learning as a service – A training and deployment platform based on Docker, OMOP-CDM, and FHIR Web Services

Authors: Julian Gruendner aff001; Thorsten Schwachhofer aff001; Phillip Sippl aff001; Nicolas Wolf aff001; Marcel Erpenbeck aff001; Christian Gulden aff001; Lorenz A. Kapsner aff002; Jakob Zierk aff002; Sebastian Mate aff002; Michael Stürzl aff004; Roland Croner aff005; Hans-Ulrich Prokosch aff001; Dennis Toddenroth aff001
Author affiliations: Chair of Medical Informatics, Friedrich-Alexander-University Erlangen-Nürnberg (FAU), Erlangen, Germany aff001; Medical Centre for Information and Communication Technology, Universitätsklinikum Erlangen, Erlangen, Germany aff002; Department of Pediatrics and Adolescent Medicine, Universitätsklinikum Erlangen, Erlangen, Germany aff003; Department of Surgery, Division of Molecular and Experimental Surgery, Friedrich-Alexander-University Erlangen-Nürnberg (FAU), Erlangen, Germany aff004; Department of General, Visceral, Vascular and Graft Surgery, University Hospital, Magdeburg, Germany aff005
Published in: PLoS ONE 14(10)
Category: Research Article


Background and objective

To take full advantage of decision support, machine learning, and patient-level prediction models, it is important that models are not only created but also deployed in a clinical setting. The KETOS platform presented in this work gives researchers a tool to perform statistical analyses and deploy the resulting models in a secure environment.


Methods

The proposed system uses Docker virtualization to provide researchers with reproducible data analysis and development environments, accessible via Jupyter Notebook, in which they can perform statistical analyses and develop, train, and deploy models based on standardized input data. The platform is modular and accesses patient data through web services that follow the Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standard. In our prototypical implementation, we use an OMOP common data model (OMOP-CDM) database. The architecture supports the entire research lifecycle, from creating a data analysis environment and retrieving data through training to final deployment in a hospital setting.
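To illustrate what retrieving standardized input data over FHIR can look like inside a notebook, the following minimal sketch flattens a FHIR search result (a Bundle of Observation resources) into tabular rows suitable for analysis. It is not taken from the KETOS code base: the function name `observations_to_rows`, the row layout, and the hand-written sample bundle are assumptions made for illustration, and the LOINC code 718-7 (hemoglobin) is used only as an example value.

```python
from typing import Any, Dict, List


def observations_to_rows(bundle: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten the Observation resources of a FHIR Bundle into flat rows.

    Hypothetical helper for illustration; not part of KETOS.
    """
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue  # ignore any resource that is not an Observation
        codings = resource.get("code", {}).get("coding", [])
        quantity = resource.get("valueQuantity", {})
        rows.append({
            "patient": resource.get("subject", {}).get("reference"),
            "code": codings[0].get("code") if codings else None,
            "value": quantity.get("value"),
            "unit": quantity.get("unit"),
            "time": resource.get("effectiveDateTime"),
        })
    return rows


# Hand-written sample data (an assumption), shaped like a FHIR searchset Bundle.
sample_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {
            "resource": {
                "resourceType": "Observation",
                "subject": {"reference": "Patient/1"},
                "code": {"coding": [{"system": "http://loinc.org", "code": "718-7"}]},
                "valueQuantity": {"value": 13.8, "unit": "g/dL"},
                "effectiveDateTime": "2019-01-15",
            }
        },
        {"resource": {"resourceType": "Patient", "id": "1"}},
    ],
}

rows = observations_to_rows(sample_bundle)
```

Keeping the parsing step separate from the HTTP request means the same function works whether the Bundle comes from a live FHIR server or from a stored file.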


Results

We evaluated the platform by building and deploying an analysis and an end-user application for hemoglobin reference intervals within the University Hospital Erlangen. To demonstrate that the system can deploy arbitrary models, we loaded a colorectal cancer dataset into an OMOP database, built machine learning models to predict patient outcomes, and made them available via a web service. We demonstrated both the integration with FHIR and an example end-user application. Finally, we integrated the platform with the open source DataSHIELD architecture to allow distributed, privacy-preserving data analysis and training across networks of hospitals.
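As a rough illustration of what a reference interval analysis computes, the sketch below estimates a central 95% interval as the 2.5th and 97.5th percentiles of a set of measurements. This is a naive nonparametric estimate chosen for brevity, not necessarily the method used in the platform's hemoglobin application; the function name `reference_interval` and the sample values are assumptions.

```python
import statistics
from typing import Sequence, Tuple


def reference_interval(values: Sequence[float]) -> Tuple[float, float]:
    """Return the 2.5th and 97.5th percentiles of the given measurements.

    Naive percentile-based sketch (an assumption), for illustration only.
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # n=40 yields 39 cut points spaced 2.5% apart:
    # the first is the 2.5th percentile, the last is the 97.5th.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Made-up hemoglobin values in g/dL, used only to show the call.
hemoglobin_g_dl = [12.1, 12.9, 13.4, 13.8, 14.0, 14.2, 14.5, 15.1, 15.6, 16.3]
low, high = reference_interval(hemoglobin_g_dl)
```

In practice, reference intervals derived from routine hospital data need to account for pathological values mixed into the sample, which a plain percentile estimate does not do.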


Conclusions

The KETOS platform takes a novel approach to data analysis and to training and deploying decision support models in a hospital or healthcare setting. It does so in a secure and privacy-preserving manner, combining the flexibility of Docker virtualization with the advantages of standardized vocabularies, a widely applied database schema (OMOP-CDM), and a standardized way to exchange medical data (FHIR).

Keywords:

Hemoglobin – Machine learning – Machine learning algorithms – Physicians – Preprocessing – Prototypes – Statistical data – Consortia



Article published in the journal PLoS ONE, 2019, Issue 10