Machine learning-based diagnosis and risk factor analysis of cardiocerebrovascular disease based on KNHANES

Oh, Taeseob; Kim, Dongkyun; Lee, Siryeol; Won, Changwon; Kim, Sunyoung; Yang, Ji-soo; Yu, Junghwa; Kim, Byungsung; Lee, Joohyun

doi:10.1038/s41598-022-06333-1

Full metadata record

DC Field	Value	Language
dc.contributor.author	Oh, Taeseob	-
dc.contributor.author	Kim, Dongkyun	-
dc.contributor.author	Lee, Siryeol	-
dc.contributor.author	Won, Changwon	-
dc.contributor.author	Kim, Sunyoung	-
dc.contributor.author	Yang, Ji-soo	-
dc.contributor.author	Yu, Junghwa	-
dc.contributor.author	Kim, Byungsung	-
dc.contributor.author	Lee, Joohyun	-
dc.date.accessioned	2022-07-18T01:19:47Z	-
dc.date.available	2022-07-18T01:19:47Z	-
dc.date.issued	2022-02	-
dc.identifier.issn	2045-2322	-
dc.identifier.uri	https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/107969	-
dc.description.abstract	The prevalence of cardiocerebrovascular disease (CVD) is continuously increasing, and it is the leading cause of human death. Since it is difficult for physicians to screen thousands of people, high-accuracy and interpretable methods need to be presented. We developed four machine learning-based CVD classifiers (i.e., multi-layer perceptron, support vector machine, random forest, and light gradient boosting) based on the Korea National Health and Nutrition Examination Survey (KNHANES). We resampled and rebalanced KNHANES data using complex sampling weights such that the rebalanced dataset mimics a uniformly sampled dataset from the overall population. For clear risk factor analysis, we removed multicollinearity and CVD-irrelevant variables using VIF-based filtering and the Boruta algorithm. We applied the synthetic minority oversampling technique and random undersampling before ML training. We demonstrated that the proposed classifiers achieved excellent performance with AUCs over 0.853. Using Shapley value-based risk factor analysis, we identified that the most significant risk factors of CVD were age, sex, and the prevalence of hypertension. Additionally, we identified that age, hypertension, and BMI were positively correlated with CVD prevalence, while sex (female), alcohol consumption, and monthly income were negatively correlated. The results showed that the feature selection and the class balancing technique effectively improve the interpretability of models.	-
dc.format.extent	11	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	Nature Publishing Group	-
dc.title	Machine learning-based diagnosis and risk factor analysis of cardiocerebrovascular disease based on KNHANES	-
dc.type	Article	-
dc.publisher.location	United Kingdom	-
dc.identifier.doi	10.1038/s41598-022-06333-1	-
dc.identifier.scopusid	2-s2.0-85124445701	-
dc.identifier.wosid	000754021000007	-
dc.identifier.bibliographicCitation	Scientific Reports, v.12, no.1, pp 1 - 11	-
dc.citation.title	Scientific Reports	-
dc.citation.volume	12	-
dc.citation.number	1	-
dc.citation.startPage	1	-
dc.citation.endPage	11	-
dc.type.docType	Article	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Science & Technology - Other Topics	-
dc.relation.journalWebOfScienceCategory	Multidisciplinary Sciences	-
dc.subject.keywordPlus	NATIONAL-HEALTH	-
dc.subject.keywordPlus	OBESITY	-
dc.identifier.url	https://www.nature.com/articles/s41598-022-06333-1	-
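
The abstract in the record above mentions applying the synthetic minority oversampling technique together with random undersampling before training. The following is a minimal sketch of that general idea in plain Python, not the authors' implementation: the function names `smote` and `random_undersample`, the neighbour count `k`, the toy two-dimensional data, and the target class sizes are all assumptions chosen for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, rng=None):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if rng is None:
        rng = random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def random_undersample(majority, n_keep, rng=None):
    """Keep n_keep majority samples, drawn without replacement."""
    if rng is None:
        rng = random.Random(0)
    return rng.sample(majority, n_keep)


# Toy data (hypothetical): 200 majority and 20 minority points in 2-D.
rng = random.Random(42)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
minority = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(20)]

# Oversample the minority class to 100 and undersample the majority to 100.
balanced_minority = minority + smote(minority, n_new=80, k=5, rng=rng)
balanced_majority = random_undersample(majority, n_keep=100, rng=rng)
print(len(balanced_minority), len(balanced_majority))
```

Combining the two steps keeps the synthetic share of the minority class smaller than oversampling alone would, at the cost of discarding some majority samples. In practice a maintained library implementation would usually be preferable to this hand-written sketch.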

Appears in Collections: COLLEGE OF ENGINEERING SCIENCES > SCHOOL OF ELECTRICAL ENGINEERING > 1. Journal Articles

Related Researcher

Lee, Joo hyun: ERICA College of Engineering (SCHOOL OF ELECTRICAL ENGINEERING)
