Developing and Evaluating a Classification Model for Construction Defect Control: A Text Mining and Ensemble Learning Approach

Jo, Inho; Han, SangHyeok; Hou, Lei; Moon, Sungkon; Kim, Jae-Jun

doi:10.1061/JMENEA.MEENG-6296

Full metadata record

DC Field	Value	Language
dc.contributor.author	Jo, Inho	-
dc.contributor.author	Han, SangHyeok	-
dc.contributor.author	Hou, Lei	-
dc.contributor.author	Moon, Sungkon	-
dc.contributor.author	Kim, Jae-Jun	-
dc.date.accessioned	2026-03-26T07:30:44Z	-
dc.date.available	2026-03-26T07:30:44Z	-
dc.date.issued	2025-03	-
dc.identifier.issn	0742-597X	-
dc.identifier.issn	1943-5479	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/211632	-
dc.description.abstract	In the construction industry, customer satisfaction is of paramount importance, as it significantly impacts company success and reputation. In Korea’s competitive apartment market, customer satisfaction, particularly feedback on newly built apartments, is vital for construction companies, as it fosters growth and customer loyalty. To gain an understanding of the sentiments and patterns within this feedback, text mining can be utilized. This study aims to extract such insights from textual data on apartment building defect complaints, using text mining and ensemble learning to develop models with high prediction accuracy. It analyzes the accuracy of the Word2Vec and term frequency–inverse document frequency (TF-IDF) models, as well as the individual performance of different classification models, including naïve Bayes, decision trees, logistic regression, k-nearest neighbors, support vector machines (SVMs), and random forests. This analysis was conducted to validate the effectiveness of ensemble learning. Data were collected from a total of 230 apartment building projects in South Korea between 2018 and 2023, resulting in a data set of 101,387 data points, which underwent analysis to validate the model. The validation results consistently showed that TF-IDF outperforms Word2Vec, with the SVM model achieving the highest performance, attaining an average F1 score of 0.7439. Ensemble learning models demonstrated an improvement in accuracy of up to 34% over single models, reaching an average accuracy of 97.47% after the removal of human error. While this study acknowledges its limitations, which include potential biases in the data set, the impact of language evolution on model precision, and difficulties in classifying complex defects, the ensemble model demonstrated substantial improvements in defect classification accuracy and provided practical insights for defect management in construction. Moving forward, future work could explore integrating multidimensional data, utilizing speech-to-text technology, prioritizing defects by severity, and employing artificial intelligence for real-time defect prediction to further enhance defect management practices.	-
dc.format.extent	15	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	American Society of Civil Engineers	-
dc.title	Developing and Evaluating a Classification Model for Construction Defect Control: A Text Mining and Ensemble Learning Approach	-
dc.type	Article	-
dc.publisher.location	United States	-
dc.identifier.doi	10.1061/JMENEA.MEENG-6296	-
dc.identifier.scopusid	2-s2.0-85210860487	-
dc.identifier.wosid	001396402300001	-
dc.identifier.bibliographicCitation	Journal of Management in Engineering - ASCE, v.41, no.2, pp 1 - 15	-
dc.citation.title	Journal of Management in Engineering - ASCE	-
dc.citation.volume	41	-
dc.citation.number	2	-
dc.citation.startPage	1	-
dc.citation.endPage	15	-
dc.type.docType	Article	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalWebOfScienceCategory	Engineering, Industrial	-
dc.relation.journalWebOfScienceCategory	Engineering, Civil	-
dc.subject.keywordPlus	Apartment houses	-
dc.subject.keywordPlus	Contrastive Learning	-
dc.subject.keywordPlus	Customer satisfaction	-
dc.subject.keywordPlus	Decision trees	-
dc.subject.keywordPlus	Information management	-
dc.subject.keywordPlus	k-nearest neighbors	-
dc.subject.keywordPlus	Logistic regression	-
dc.subject.keywordPlus	Project management	-
dc.subject.keywordPlus	Speech enhancement	-
dc.subject.keywordPlus	Support vector machines	-
dc.subject.keywordAuthor	Customer satisfaction	-
dc.subject.keywordAuthor	Defect complaint classification	-
dc.subject.keywordAuthor	Ensemble learning	-
dc.subject.keywordAuthor	Practical implications	-
dc.subject.keywordAuthor	Prediction accuracy	-
dc.subject.keywordAuthor	Text mining	-
dc.identifier.url	https://ascelibrary.org/doi/10.1061/JMENEA.MEENG-6296	-
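
The abstract names three building blocks: TF-IDF text features, an ensemble of classifiers, and an average F1 score. The record gives no implementation details, so the following is only a minimal standard-library sketch of those ideas under stated assumptions: the TF-IDF variant (relative term frequency times log inverse document frequency), hard majority voting as the ensemble rule, and macro averaging for F1 are all assumptions, and the complaint tokens and labels are invented toy data, not the study's data set.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: (count / doc length) * log(N / document frequency).
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors


def majority_vote(predictions):
    """Hard voting (an assumption): most common label among base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, already tokenized defect complaints.
toy_docs = [
    ["leak", "bathroom", "ceiling"],
    ["crack", "wall"],
    ["leak", "kitchen"],
]
toy_vectors = tfidf(toy_docs)

# Hypothetical labels predicted by three base classifiers for one complaint.
toy_vote = majority_vote(["leak", "crack", "leak"])

# Hypothetical true and predicted labels for three complaints.
toy_f1 = macro_f1(["a", "a", "b"], ["a", "b", "b"])
```

In this toy corpus, "crack" appears in only one complaint, so it receives a higher weight than "leak", which appears in two. A term that occurs in every document would receive a weight of zero under this variant.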

Appears in Collections: Seoul College of Engineering > Seoul School of Architectural Engineering > 1. Journal Articles
