Detailed Information

Developing an automated framework for eco-label information categorization using web crawling and Natural Language Processing techniques

Full metadata record
DC Field | Value | Language
dc.contributor.author | Nguyen, Ho Anh Thu | -
dc.contributor.author | Pham, Duy Hoang | -
dc.contributor.author | Kim, Byeol | -
dc.contributor.author | Ahn, Yonghan | -
dc.contributor.author | Kwon, Nahyun | -
dc.date.accessioned | 2025-05-16T08:00:35Z | -
dc.date.available | 2025-05-16T08:00:35Z | -
dc.date.issued | 2025-07 | -
dc.identifier.issn | 0957-4174 | -
dc.identifier.issn | 1873-6793 | -
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/125248 | -
dc.description.abstract | Eco-labels are extensively employed to assess the environmental performance of building materials. However, their management is often fragmented across disparate online databases with inconsistent data structures, presenting significant challenges for efficient information acquisition and management. This study explores the application of web crawling techniques, Natural Language Processing (NLP), and machine learning (ML) models to collect and categorize eco-label information, with the objective of advancing the automation of information management processes. The results demonstrate that the categorization models exhibit high performance, achieving F1-scores exceeding 0.95 on the test set and at least 0.76 when validating datasets incorporating temporally updated information. However, the limited availability of data for certain eco-labels, such as Forest Stewardship Council certification and Green Screen, substantially degrades model performance with updated data. Notably, traditional ML models leveraging manual feature engineering outperform deep learning models with automatic feature extraction when applied to web-crawled data. Furthermore, the TF-IDF feature extraction technique surpasses other n-gram-based approaches, with model performance declining as n-gram length increases. This study establishes a systematic framework that informs the selection of reliable data sources, feature engineering strategies, and ML algorithms for integrating web crawling, thereby enhancing the automation of eco-label information management. | -
dc.format.extent | 24 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | PERGAMON-ELSEVIER SCIENCE LTD | -
dc.title | Developing an automated framework for eco-label information categorization using web crawling and Natural Language Processing techniques | -
dc.type | Article | -
dc.publisher.location | United Kingdom | -
dc.identifier.doi | 10.1016/j.eswa.2025.127688 | -
dc.identifier.scopusid | 2-s2.0-105003373949 | -
dc.identifier.wosid | 001481749600001 | -
dc.identifier.bibliographicCitation | EXPERT SYSTEMS WITH APPLICATIONS, v.282, pp 1 - 24 | -
dc.citation.title | EXPERT SYSTEMS WITH APPLICATIONS | -
dc.citation.volume | 282 | -
dc.citation.startPage | 1 | -
dc.citation.endPage | 24 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Operations Research & Management Science | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.relation.journalWebOfScienceCategory | Operations Research & Management Science | -
dc.subject.keywordPlus | BUILDING-MATERIALS | -
dc.subject.keywordPlus | SOCIAL MEDIA | -
dc.subject.keywordPlus | BIM | -
dc.subject.keywordPlus | CLASSIFICATION | -
dc.subject.keywordPlus | INTEGRATION | -
dc.subject.keywordPlus | MANAGEMENT | -
dc.subject.keywordPlus | ENERGY | -
dc.subject.keywordAuthor | Green building material | -
dc.subject.keywordAuthor | Eco-label | -
dc.subject.keywordAuthor | Information management | -
dc.subject.keywordAuthor | Machine learning | -
dc.subject.keywordAuthor | Natural language processing | -
dc.identifier.url | https://www.sciencedirect.com/science/article/pii/S0957417425013107?pes=vor&utm_source=scopus&getft_integrator=scopus | -
Appears in Collections: COLLEGE OF ENGINEERING SCIENCES > MAJOR IN ARCHITECTURAL ENGINEERING > 1. Journal Articles
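
Illustrative note on the abstract: the record reports that TF-IDF features outperform other n-gram-based approaches and that performance declines as n-gram length increases. The following is a minimal sketch of TF-IDF weighting over word n-grams, written only to make that comparison concrete. It is not the authors' implementation: the function names, the sample descriptions, and the idf variant (ln(N/df) + 1) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (joined by spaces) of a lowercased text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=1):
    """Compute one {term: weight} dict per document over word n-grams.

    tf  = count of the term in the document / number of terms in the document
    idf = ln(N / df) + 1, with df the number of documents containing the term
    (this idf variant is an assumption; other smoothing schemes exist)
    """
    term_lists = [word_ngrams(doc, n) for doc in docs]
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))  # count each term once per document
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for short texts
        vectors.append({
            term: (count / total) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


# Hypothetical product descriptions, invented for illustration only.
docs = [
    "fsc certified plywood panel",
    "low voc interior paint",
    "fsc certified oak flooring",
]

unigram = tfidf_vectors(docs, n=1)
bigram = tfidf_vectors(docs, n=2)
vocab_1 = {term for vec in unigram for term in vec}
vocab_2 = {term for vec in bigram for term in vec}
print(len(vocab_1), len(vocab_2))
```

In this toy corpus, a term that appears in a single description ("plywood") receives a higher weight than one shared across descriptions ("fsc"). Longer n-grams recur less often across documents, so their features become sparser; this is one plausible reading of the reported decline with n-gram length, not a finding taken from the paper.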

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Ahn, Yong Han
ERICA College of Engineering Sciences (MAJOR IN ARCHITECTURAL ENGINEERING)