Machine Learning for the Prediction of New-Onset Diabetes Mellitus during 5-Year Follow-up in Non-Diabetic Patients with Cardiovascular Risks

Choi, Byoung Geol; Rha, Seung-Woon; Kim, Suhng Wook; Kang, Jun Hyuk; Park, Ji Young; Noh, Yung-Kyun

doi:10.3349/ymj.2019.60.2.191

Access: Open access

Authors: Choi, Byoung Geol; Rha, Seung-Woon; Kim, Suhng Wook; Kang, Jun Hyuk; Park, Ji Young; Noh, Yung-Kyun

Issue Date: Feb-2019

Publisher: YONSEI UNIV COLL MEDICINE

Keywords: Type 2 diabetes mellitus; diabetes; machine learning; prediction; big data

Citation: YONSEI MEDICAL JOURNAL, v.60, no.2, pp 191 - 199

Pages: 9

Indexed: SCI; SCIE; SCOPUS

Journal Title: YONSEI MEDICAL JOURNAL

Volume: 60

Number: 2

Start Page: 191

End Page: 199

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/148293

DOI: 10.3349/ymj.2019.60.2.191

ISSN: 0513-5796; 1976-2437

Abstract:

Purpose: Many studies have proposed predictive models for type 2 diabetes mellitus (T2DM). However, these predictive models have several limitations, in areas such as user convenience and reproducibility. The purpose of this study was to develop a T2DM predictive model using electronic medical records (EMRs) and machine learning, and to compare the performance of this model with that of traditional statistical methods.

Materials and Methods: A total of 8454 patients who had no history of diabetes and were treated at the cardiovascular center of Korea University Guro Hospital were enrolled. All subjects completed 5 years of follow-up. The incidence of new-onset T2DM during follow-up was 4.78% (404/8454). A total of 28 variables were extracted from the EMRs. Logistic regression (LR), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and K-nearest neighbor (KNN) models were generated and evaluated by cross-validation. The LR model was treated as the existing statistical analysis method.

Results: In 10-fold cross-validation, the standard deviation of the area under the curve (AUC) remained below 0.01 for all predictive models. Among all predictive models, the LR model showed the highest prediction performance, with an AUC of 0.78. However, the LDA, QDA, and KNN models did not differ from the LR model to a statistically significant degree.

Conclusion: We successfully developed and verified a T2DM prediction system using machine learning and an EMR database, and it predicted the 5-year occurrence of T2DM with performance similar to that of a traditional prediction model. Further study is needed to apply and verify the prediction model through clinical research.
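The evaluation procedure described in the abstract (10-fold cross-validation, with the mean and standard deviation of the AUC reported per model) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data are synthetic, a simple K-nearest neighbor scorer stands in for the four models, and every function name and parameter value here (`make_synthetic`, `stratified_folds`, `n_neighbors=15`, the seeds) is an assumption introduced for the example.

```python
import random
import statistics


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split indices into k folds, keeping the class ratio similar in each."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def knn_score(train_x, train_y, x, n_neighbors):
    """Fraction of positives among the n_neighbors nearest training points."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )[:n_neighbors]
    return sum(train_y[i] for i in nearest) / n_neighbors


def cross_validated_auc(xs, ys, k=10, n_neighbors=15, seed=0):
    """Return (mean, standard deviation) of the AUC across k held-out folds."""
    rng = random.Random(seed)
    aucs = []
    for test_idx in stratified_folds(ys, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        scores = [knn_score(train_x, train_y, xs[i], n_neighbors) for i in test_idx]
        aucs.append(auc([ys[i] for i in test_idx], scores))
    return statistics.mean(aucs), statistics.stdev(aucs)


def make_synthetic(n=600, positive_rate=0.1, seed=1):
    """Hypothetical two-feature data set; it does not model the study's EMR variables."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        y = 1 if rng.random() < positive_rate else 0
        xs.append([rng.gauss(1.0 * y, 1.0), rng.gauss(0.5 * y, 1.0)])
        ys.append(y)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic()
    mean_auc, sd_auc = cross_validated_auc(xs, ys)
    print(f"10-fold AUC: mean={mean_auc:.3f}, SD={sd_auc:.3f}")
```

Stratifying the folds matters when the outcome is rare, as with the 4.78% event rate reported here: an unstratified split can leave a fold with very few positive cases, which makes the per-fold AUC unstable or undefined.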

Files in This Item

0069YMJ_ymj-60-191.pdf 875.86 kB

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Related Researcher

Noh, Yung Kyun: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
