Glycemic-aware metrics and oversampling techniques for predicting blood glucose levels using machine learning

Authors: Michael Mayo aff001; Lynne Chepulis aff002; Ryan G. Paul aff002
Author affiliations: Department of Computer Science, University of Waikato, Hamilton, New Zealand aff001; Waikato Medical Research Center, University of Waikato, Hamilton, New Zealand aff002; Waikato Regional Diabetes Service, University of Waikato, Hamilton, New Zealand aff003
Published in: PLoS ONE 14(12)
Category: Research Article
doi: 10.1371/journal.pone.0225613


Machine learning techniques for short-term blood glucose level prediction in patients with Type 1 Diabetes are investigated. This problem is significant for the development of effective artificial pancreas technology, so that accurate alerts (e.g. hypoglycemia alarms) and other forecasts can be generated. It is shown that two factors must be considered when selecting the best machine learning technique for blood glucose level regression: (i) the regression performance metrics used to select the model, and (ii) the preprocessing techniques required to account for the imbalanced time that patients spend in different portions of the glycemic range. Using standard benchmark data, it is demonstrated that different combinations of regression model and preprocessing technique exhibit different accuracies depending on the glycemic subrange under consideration. Technique selection therefore depends on the type of alert required. Specifically, a linear Support Vector Regression-based model trained on the original features together with polynomial features is best for blood glucose level forecasting in the normal and hyperglycemic ranges, while a Multilayer Perceptron trained on oversampled data is best for predictions in the hypoglycemic range.
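The two factors named in the abstract, range-aware evaluation and rebalancing of the training data, can be illustrated with a minimal sketch. It is not the authors' implementation. The thresholds of 70 and 180 mg/dL, the function names, and the toy readings are assumptions introduced here, and plain random duplication stands in for whichever oversampling techniques the paper actually evaluates.

```python
import math
import random

# Assumed range boundaries in mg/dL; the abstract does not state them.
HYPO_MAX = 70.0
HYPER_MIN = 180.0


def glycemic_range(bg):
    """Label a blood glucose value as 'hypo', 'normal', or 'hyper'."""
    if bg < HYPO_MAX:
        return "hypo"
    if bg > HYPER_MIN:
        return "hyper"
    return "normal"


def rmse_by_range(y_true, y_pred):
    """Compute RMSE separately for each glycemic range of the true values."""
    squared_errors = {"hypo": [], "normal": [], "hyper": []}
    for t, p in zip(y_true, y_pred):
        squared_errors[glycemic_range(t)].append((t - p) ** 2)
    return {
        name: math.sqrt(sum(errs) / len(errs)) if errs else float("nan")
        for name, errs in squared_errors.items()
    }


def oversample_hypo(X, y, seed=0):
    """Randomly duplicate hypoglycemic examples until they match the rest in number."""
    rng = random.Random(seed)
    hypo = [i for i, t in enumerate(y) if glycemic_range(t) == "hypo"]
    rest = [i for i, t in enumerate(y) if glycemic_range(t) != "hypo"]
    if not hypo:
        return list(X), list(y)
    extra = [rng.choice(hypo) for _ in range(max(0, len(rest) - len(hypo)))]
    idx = rest + hypo + extra
    return [X[i] for i in idx], [y[i] for i in idx]


# Toy readings (hypothetical, for illustration only).
y_true = [60, 65, 100, 120, 150, 200, 250]
y_pred = [70, 65, 105, 120, 145, 190, 250]
scores = rmse_by_range(y_true, y_pred)

X = [[v] for v in y_true]
X_bal, y_bal = oversample_hypo(X, y_true)
```

A single overall RMSE would hide the fact that a model can be accurate in the normal range yet poor in the hypoglycemic range, which is why the per-range breakdown matters when the goal is a specific alert type.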

Keywords:

Blood sugar – Decision tree learning – Decision trees – Hypoglycemia – Hypoglycemics – Machine learning – Machine learning algorithms – Polynomials



Article published in: PLoS ONE, 2019, Issue 12