Reconstruction error based deep neural networks for coronary heart disease risk prediction


Authors: Tsatsral Amarbayasgalan aff001; Kwang Ho Park aff001; Jong Yun Lee aff001; Keun Ho Ryu aff002
Author affiliations: Database and Bioinformatics Laboratory, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Korea aff001; Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam aff002; College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Korea aff003
Published in: PLoS ONE 14(12)
Category: Research Article
doi: 10.1371/journal.pone.0225991

Abstract

Coronary heart disease (CHD) is one of the leading causes of death worldwide; patients in the end stage of CHD require the most advanced treatments, such as heart surgery and heart transplantation. Moreover, CHD is not easy to diagnose at an early stage; hospitals diagnose it based on various types of medical tests. Identifying people who are at high risk of CHD in advance is therefore important for reducing the risk of developing the disease. In recent years, several studies have used data mining to predict the risk of developing diseases from medical test results. In this study, we propose reconstruction error (RE) based deep neural networks (DNNs); the approach uses a deep autoencoder (AE) model to estimate the RE. First, the training dataset is divided into two groups according to the divergence of their RE on a deep AE model trained on the whole training dataset. Next, a separate DNN classifier is trained on each group, combining a new RE-based feature with the other risk factors to predict the risk of developing CHD. To create the new feature, we use a deep AE model trained only on the high-risk dataset. We performed an experiment to show that the components of the proposed method work more efficiently together. In this experiment, the accuracy, precision, recall, F-measure, and AUC score reached 86.3371%, 91.3716%, 82.9024%, 86.9148%, and 86.6568%, respectively. These results show that the proposed AE-DNNs outperformed regular machine learning-based classifiers for CHD risk prediction.
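
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a rank-k linear reconstruction (PCA via NumPy's SVD) stands in for the deep AE, the split threshold (the mean RE) is an assumption because the abstract does not state the rule, the data are synthetic, and the function names `fit_linear_ae`, `reconstruction_error`, and `split_and_augment` are hypothetical. The two DNN classifiers are left out; the sketch only shows how the two groups and the RE-based feature might be produced.

```python
import numpy as np

def fit_linear_ae(X, k):
    """Fit a rank-k linear stand-in for an autoencoder (PCA); return (mean, components)."""
    mu = X.mean(axis=0)
    # Rows of vt are principal directions; the first k act as the bottleneck.
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def reconstruction_error(X, model):
    """Per-sample squared reconstruction error under a fitted model."""
    mu, comps = model
    Z = (X - mu) @ comps.T      # encode
    X_hat = Z @ comps + mu      # decode
    return ((X - X_hat) ** 2).sum(axis=1)

def split_and_augment(X, y, k=2):
    # Step 1: a model fitted on the whole training set; its RE drives the split.
    ae_all = fit_linear_ae(X, k)
    re_all = reconstruction_error(X, ae_all)
    threshold = re_all.mean()   # ASSUMPTION: the abstract does not give the rule
    high_re = re_all > threshold

    # Step 2: a model fitted on high-risk samples only; its RE is the new feature.
    ae_risk = fit_linear_ae(X[y == 1], k)
    re_feature = reconstruction_error(X, ae_risk)
    X_aug = np.column_stack([X, re_feature])

    # Step 3: each group would be passed to its own classifier (DNNs in the source).
    return (X_aug[~high_re], y[~high_re]), (X_aug[high_re], y[high_re])

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))             # synthetic stand-in for risk factors
y = (rng.random(200) < 0.5).astype(int)   # synthetic labels, 1 = high risk
low_group, high_group = split_and_augment(X, y)
```

Each returned group holds the original risk factors plus one extra RE column, so any classifier trained on a group sees the RE-based feature alongside the other inputs.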

Keywords:

Algorithms – Coronary heart disease – Cholesterol – Machine learning – Machine learning algorithms – Neural networks – Support vector machines


Published in the journal

PLOS One


2019, Issue 12