Original research article
Front. Genet.
Sec. Computational Genomics
Volume 16 - 2025 | doi: 10.3389/fgene.2025.1451290
Provisionally accepted
- 1 Al-Quds University, Jerusalem, Palestine
- 2 Zefat Academic College, Safed, Israel
- 3 Abdullah Gül University, Kayseri, Türkiye
Diabetes has a major impact on millions of people around the world, leading to substantial morbidity, disability, and mortality. Predicting diabetes-related complications from health records is important for early prevention and for developing effective treatment plans. This study introduces a new feature engineering approach to predict four different diabetes complications, i.e., retinopathy, chronic kidney disease, ischemic heart disease, and amputations. During classification model development, we utilize XGBoost feature selection and various supervised machine learning algorithms, including Random Forest, XGBoost, LogitBoost, AdaBoost, and Decision Tree. These models were trained on synthetic electronic health records (EHRs) generated by a dual-adversarial autoencoder. These EHRs represent nearly one million synthetic patients derived from an authentic cohort of 979,308 individuals with diabetes. The variables considered in the models were the age range together with the chronic diseases that occur during patient visits, starting from the onset of diabetes. Throughout the experiments, XGBoost and Random Forest achieved the best overall predictive performance. The final models, tailored to each complication and trained using the feature engineering approach, achieved accuracy between 69% and 77% and AUC between 77% and 84% using cross-validation. However, the hold-out validation approach yielded accuracy between 59% and 78% and AUC between 66% and 85%. These findings suggest that our method outperforms the traditional bag-of-features approach, highlighting its effectiveness in increasing model accuracy and robustness.
Keywords:
diabetes, diabetes complications, machine learning, Random Forest, XGBoost, LogitBoost, AdaBoost, Decision Tree
Received:
June 18, 2024.
Accepted:
January 31, 2025.
Copyright:
© 2025 Voskergian, Yousef, Bakir-Gungor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence:
Daniel Voskergian, Al-Quds University, Jerusalem, Palestine
Malik Yousef, Zefat Academic College, Safed, Israel
Disclaimer:
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.