Development and verification of prediction models for preventing cardiovascular diseases
Authors:
Ji Min Sung aff001; In-Jeong Cho aff002; David Sung aff003; Sunhee Kim aff004; Hyeon Chang Kim aff005; Myeong-Hun Chae aff007; Maryam Kavousi aff008; Oscar L. Rueda-Ochoa aff008; M. Arfan Ikram aff008; Oscar H. Franco aff008; Hyuk-Jae Chang aff005
Author affiliations:
aff001: Integrative Research Center for Cerebrovascular and Cardiovascular diseases, Yonsei University College of Medicine, Yonsei University Health System, Seoul, Korea
aff002: Division of Cardiology, Ewha University College of Medicine, Seoul, Korea
aff003: Data Science Team of KT NexR, Seoul, Korea
aff004: Yonsei University College of Medicine, Yonsei University Health System, Seoul, Korea
aff005: Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University College of Medicine, Seoul, Korea
aff006: Department of Preventive Medicine, Yonsei University College of Medicine, Seoul, Korea
aff007: AI R&D Lab. of Selvas AI Inc., Seoul, Korea
aff008: Department of Epidemiology, Erasmus MC, Rotterdam, the Netherlands
aff009: School of Medicine, Faculty of Health, Universidad Industrial de Santander UIS, Bucaramanga, Colombia
aff010: Department of Radiology, Erasmus MC, Rotterdam, the Netherlands
aff011: Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Korea
Published in:
PLoS ONE 14(9)
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pone.0222809
Summary
Objectives
Cardiovascular disease (CVD) is one of the major causes of death worldwide. To improve the accuracy of CVD prediction, risk classification was performed using national time-series health examination data. Such data make it possible to apply deep learning, specifically a recurrent neural network with long short-term memory (RNN-LSTM), which is widely regarded as a strong algorithm for analyzing time-series datasets. The objective of this study was to demonstrate the improved accuracy of deep learning by comparing the performance of a Cox hazard regression model and an RNN-LSTM model based on survival analysis.
Methods and findings
We selected 361,239 subjects (age 40 to 79 years) with more than two health examination records from 2002 to 2006 using the National Health Insurance System-National Health Screening Cohort (NHIS-HEALS). The average number of health screenings (from 2002 to 2013) used in the analysis was 2.9 ± 1.0. Two CVD prediction models were developed from the NHIS-HEALS data: a Cox hazard regression model and a deep learning model. In an internal validation on the NHIS-HEALS dataset, the Cox regression model reached its highest time-dependent area under the curve (AUC) at 2 years: 0.79 (95% CI 0.70 to 0.87) in females and 0.75 (95% CI 0.70 to 0.80) in males. The deep learning model also reached its highest time-dependent AUC at 2 years: 0.94 (95% CI 0.91 to 0.97) in females and 0.96 (95% CI 0.95 to 0.97) in males. Layer-wise Relevance Propagation (LRP) revealed that age was the variable with the greatest effect on CVD, followed by systolic blood pressure (SBP) and diastolic blood pressure (DBP), in that order.
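To make the time-dependent AUC reported above more concrete, the following is a minimal sketch of how an AUC at a fixed horizon could be computed from predicted risks. It is not the study's implementation: the function names, the toy data, and the simplification of dropping subjects censored before the horizon (instead of weighting them) are all assumptions made for illustration.

```python
def auc_mann_whitney(scores, labels):
    """AUC as the probability that a random case has a higher score
    than a random non-case; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_at_horizon(risk, event_time, event, horizon):
    """Simplified AUC at a horizon (hypothetical helper).

    Cases: event observed at or before the horizon.
    Non-cases: still under follow-up after the horizon.
    Subjects censored before the horizon are dropped (no censoring weights).
    """
    scores, labels = [], []
    for r, t, e in zip(risk, event_time, event):
        if t <= horizon and e == 1:
            scores.append(r)
            labels.append(1)
        elif t > horizon:
            scores.append(r)
            labels.append(0)
    return auc_mann_whitney(scores, labels)


# Toy data, invented for illustration only (not from NHIS-HEALS).
risk = [0.9, 0.25, 0.3, 0.2, 0.6, 0.1]
event_time = [1.0, 1.5, 3.0, 4.0, 1.2, 5.0]   # years of follow-up
event = [1, 1, 0, 1, 0, 0]                    # 1 = CVD event, 0 = censored
auc_2y = auc_at_horizon(risk, event_time, event, horizon=2.0)
```

In this toy example two subjects are cases at 2 years and three are non-cases; five of the six case/non-case pairs are ranked correctly. A full analysis would normally account for censoring, for example through inverse probability of censoring weights, which this sketch omits.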
Conclusion
The deep learning model predicted CVD occurrences better than the Cox regression model. In addition, LRP applied to the deep learning model identified the known risk factors that previous clinical studies have shown to be important.
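As an illustration of how LRP assigns relevance to input variables, the sketch below applies the epsilon rule to a single dense layer. It is a simplified stand-in, not the study's model: the study used an RNN-LSTM, whose gated connections need additional propagation rules that are omitted here, and the function name, feature values, and weights are hypothetical.

```python
def lrp_epsilon_dense(x, weights, bias, relevance_out, eps=1e-6):
    """Redistribute output relevance to the inputs of one dense layer.

    x: input activations (length n_in)
    weights: n_in x n_out nested list, weights[i][j] connects input i to output j
    bias: length n_out
    relevance_out: relevance assigned to each output (length n_out)
    """
    n_in, n_out = len(x), len(bias)
    # Pre-activations z_j = sum_i x_i * w_ij + b_j
    z = [sum(x[i] * weights[i][j] for i in range(n_in)) + bias[j]
         for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # Epsilon stabilizer keeps the denominator away from zero.
        denom = z[j] + (eps if z[j] >= 0 else -eps)
        for i in range(n_in):
            relevance_in[i] += x[i] * weights[i][j] / denom * relevance_out[j]
    return relevance_in


# Hypothetical standardized inputs and weights, for illustration only.
features = ["age", "SBP", "DBP"]
x = [1.0, 0.5, 0.25]
weights = [[0.8], [0.4], [0.2]]
bias = [0.0]
output = sum(x[i] * weights[i][0] for i in range(3)) + bias[0]
relevance = lrp_epsilon_dense(x, weights, bias, [output])
ranking = [name for _, name in sorted(zip(relevance, features), reverse=True)]
```

With a zero bias and a small epsilon, the input relevances sum to approximately the output relevance, so each value can be read as that variable's share of the prediction. The ordering in this toy example follows from the invented weights and says nothing about the study's actual results.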
Keywords:
Physical sciences – Research and analysis methods – Computer and information sciences – Mathematics – Simulation and modeling – Medicine and health sciences – Endocrinology – Endocrine disorders – Metabolic disorders – Statistics – Mathematical and statistical techniques – Statistical methods – Public and occupational health – Applied mathematics – Algorithms – Vascular medicine – Health screening – Cardiovascular medicine – Blood pressure – Artificial intelligence – Machine learning – Epidemiology – Medical risk factors – Cardiovascular diseases – Deep learning
This article was published in
PLOS One
2019, Issue 9