Personalized Diabetes Risk Assessment Through Multifaceted Analysis (PD-RAMA): A Novel Machine Learning Approach to Early Detection and Management of Type 2 Diabetes

  • Gharbi Alshammari (Department of Information and Computer Science, College of Computer Science and Engineering, University of Ha'il)
  • Received : 2023.08.05
  • Published : 2023.08.30

Abstract

The rising global prevalence of Type 2 Diabetes Mellitus (T2DM) has created an urgent need for robust early diagnostic methods. This study presents an approach to predicting T2DM with the Extreme Gradient Boosting (XGBoost) algorithm, which is known for its predictive accuracy and computational efficiency. The investigation uses a curated dataset of 4303 samples, drawn from a Chinese research study and aligned with the World Health Organization's indicators and standards. The dataset covers clinical, demographic, and lifestyle attributes. Hyperparameter optimization identified the best-scoring configuration: a learning rate of 0.1, a maximum depth of 3, 150 estimators, and specific colsample (column subsampling) settings. The model achieved a validation accuracy of 0.957, with a sensitivity of 0.9898 and a specificity of 0.8897, indicating robust classification of T2DM. Analysis of the confusion matrix gave an F1-score of 0.9308, reflecting a balance between precision and recall. The precision and recall metrics show how well the model limits false predictions, which supports its clinical applicability. The findings underline the efficacy of XGBoost in T2DM prediction and contribute to the growing field of machine learning applications in personalized healthcare. By integrating multiple clinical parameters, the study offers a promising avenue for early detection, risk stratification, and patient-centric intervention in diabetes care, and it encourages further work on advanced analytical techniques for predictive diagnostics and chronic disease management.
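
The abstract reports accuracy, sensitivity, specificity, precision, recall, and F1-score, all of which derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the author's code. The class name `ConfusionMatrix` and the counts in the usage example are hypothetical and chosen only for illustration; they are not the study's data.

```python
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (hypothetical helper, not from the paper)."""
    tp: int  # diabetic cases predicted as diabetic
    fp: int  # non-diabetic cases predicted as diabetic
    tn: int  # non-diabetic cases predicted as non-diabetic
    fn: int  # diabetic cases predicted as non-diabetic

    @property
    def accuracy(self) -> float:
        return _ratio(self.tp + self.tn, self.tp + self.fp + self.tn + self.fn)

    @property
    def sensitivity(self) -> float:
        # Also called recall or true positive rate.
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # True negative rate.
        return _ratio(self.tn, self.tn + self.fp)

    @property
    def precision(self) -> float:
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall, written in terms of counts.
        return _ratio(2 * self.tp, 2 * self.tp + self.fp + self.fn)


# Illustrative counts only (assumed, not taken from the study).
example = ConfusionMatrix(tp=90, fp=20, tn=80, fn=10)
print(f"accuracy={example.accuracy:.4f}")
print(f"sensitivity={example.sensitivity:.4f}")
print(f"specificity={example.specificity:.4f}")
print(f"precision={example.precision:.4f}")
print(f"f1={example.f1:.4f}")
```

Note that the F1-score depends only on true positives, false positives, and false negatives, so it should be read alongside specificity when true negative performance matters.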
