A Natural language processing based machine learning approach on building material eco-label databases wrangling

Full metadata record
DC Field: Value
dc.contributor.author: Pham, Duy Hoang
dc.contributor.author: Park, Sojin
dc.contributor.author: Ahn, Yonghan
dc.date.accessioned: 2024-11-05T07:00:18Z
dc.date.available: 2024-11-05T07:00:18Z
dc.date.issued: 2024-09
dc.identifier.issn: 2093-761X
dc.identifier.issn: 2093-7628
dc.identifier.uri: https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/120762
dc.description.abstract: In recent years, databases promoting eco-labeled building materials (EBM) have received increasing attention. However, extracting useful information from vast and disparate EBM databases remains a challenge due to inconsistencies in data formats, terminology, and organization. This research proposes a natural language processing (NLP) based machine learning (ML) approach to streamline the wrangling of EBM databases. The study investigates EBM databases and develops web-scraping and web-crawling tools to collect data from them, resulting in a refined dataset of 64,350 data points. It then leverages NLP techniques and ML algorithms to standardize terminology, resolve inconsistencies, and integrate diverse EBM databases into a cohesive database. The Random Forest (RF) algorithm consistently emerged as a top-performing classifier, achieving high AUC scores in models such as “PBTs”, “Cradle to Gate”, and the “UL GREENGUARD label”. For many ecolabels, RF performed well, as shown by its F1-scores for attributes such as “PBTs” (94.19% in cross-validation, 94.72% in testing) and “C2Gate” (92.78% in cross-validation, 93.25% in testing). This structured representation facilitates efficient querying and analysis, enabling stakeholders to make informed decisions about EBM selection and utilization. By automating the labor-intensive process of EBM data wrangling, our research contributes to the advancement of sustainable construction practices and the broader goal of environmental stewardship in the built environment. © International Journal of Sustainable Building Technology and Urban Development.
dc.format.extent: 14
dc.language: English
dc.language.iso: ENG
dc.publisher: Sustainable Building Research Center
dc.title: A Natural language processing based machine learning approach on building material eco-label databases wrangling
dc.type: Article
dc.publisher.location: United Kingdom
dc.identifier.doi: 10.22712/susb.20240026
dc.identifier.scopusid: 2-s2.0-85207032591
dc.identifier.bibliographicCitation: International Journal of Sustainable Building Technology and Urban Development, v.15, no.3, pp. 367-380
dc.citation.title: International Journal of Sustainable Building Technology and Urban Development
dc.citation.volume: 15
dc.citation.number: 3
dc.citation.startPage: 367
dc.citation.endPage: 380
dc.type.docType: Article
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scopus
dc.subject.keywordAuthor: data wrangling
dc.subject.keywordAuthor: green building materials
dc.subject.keywordAuthor: machine learning
dc.subject.keywordAuthor: natural language processing
dc.subject.keywordAuthor: text classification
dc.identifier.url: https://www.sbt-durabi.org/articles/article/v8AV/#Information
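
The abstract in the record above mentions standardizing terminology across disparate EBM databases. As a rough illustration of what one such step might look like, here is a minimal sketch that maps spelling variants of a label name to a single canonical form. It is not the authors' method: the function names, the variant spellings, and the mapping itself are hypothetical, and only the label names "PBTs", "C2Gate", and "UL GREENGUARD" echo the abstract.

```python
import re
from typing import Optional

# Hypothetical variant-to-canonical mapping, for illustration only.
# The label names echo the abstract; the variant spellings are assumptions.
CANONICAL_LABELS = {
    "pbts": "PBTs",
    "c2gate": "C2Gate",
    "cradle to gate": "C2Gate",
    "ul greenguard": "UL GREENGUARD",
    "greenguard": "UL GREENGUARD",
}


def normalize(raw: str) -> str:
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()


def standardize_label(raw: str) -> Optional[str]:
    """Return the canonical label for a raw string, or None if it is unknown."""
    return CANONICAL_LABELS.get(normalize(raw))
```

A lookup table like this only covers variants listed in advance; unknown strings return None so they can be reviewed by hand or passed to a trained classifier instead.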
Appears in Collections
COLLEGE OF ENGINEERING SCIENCES > MAJOR IN ARCHITECTURAL ENGINEERING > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Ahn, Yong Han
ERICA College of Engineering (MAJOR IN ARCHITECTURAL ENGINEERING)