Encoder-Based Multimodal Ensemble Learning for High Compatibility and Accuracy in Phishing Website Detection
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ahn, Jemin | - |
dc.contributor.author | Akhavan, Dorian | - |
dc.contributor.author | Jung, Woohwan | - |
dc.contributor.author | Kang, Kyungtae | - |
dc.contributor.author | Son, Junggab | - |
dc.date.accessioned | 2025-10-01T04:30:33Z | - |
dc.date.available | 2025-10-01T04:30:33Z | - |
dc.date.issued | 2025-09 | - |
dc.identifier.issn | 1867-8211 | - |
dc.identifier.issn | 1867-822X | - |
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/126589 | - |
dc.description.abstract | Phishing websites pose a significant threat to modern network security. In response, various detection methods have been developed, with deep learning-based approaches recently becoming dominant. As phishing tactics grow increasingly sophisticated, the use of diverse data types and advanced deep learning models is essential in contemporary detection methods. However, integrating various data types can cause compatibility issues, posing challenges for deep learning techniques. Furthermore, there is potential to enhance model accuracy through the careful selection of data types. To address these issues, this paper proposes a novel encoder-based multimodal ensemble learning approach to achieve high compatibility and accuracy in phishing website detection. Our method leverages two features: URLs and text content extracted from a single data source, HTML. HTML builds a crucial foundation, and these features are the most effective ones that illustrate every component of a website. Therefore, selecting these features from the single data source contributes to enhancing not only reliability but also compatibility of our model. Since both features are text-based and sequential, we employ Bidirectional Encoder Representations from Transformers (BERT) for its superior performance in handling such data. Comprehensive experiments demonstrate that our model achieves a classification accuracy of 98.9%, surpassing both our baseline models and existing detection methods. | - |
dc.format.extent | 19 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Springer Science and Business Media Deutschland GmbH | - |
dc.title | Encoder-Based Multimodal Ensemble Learning for High Compatibility and Accuracy in Phishing Website Detection | - |
dc.type | Article | - |
dc.publisher.location | Germany | - |
dc.identifier.doi | 10.1007/978-3-031-94455-0_16 | - |
dc.identifier.scopusid | 2-s2.0-105016211657 | - |
dc.identifier.bibliographicCitation | Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, v.629 LNICST, pp 347 - 365 | - |
dc.citation.title | Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST | - |
dc.citation.volume | 629 LNICST | - |
dc.citation.startPage | 347 | - |
dc.citation.endPage | 365 | - |
dc.type.docType | Conference Paper | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.subject.keywordAuthor | Bidirectional Encoder Representations from Transformers (BERT) | - |
dc.subject.keywordAuthor | Encoder Model | - |
dc.subject.keywordAuthor | Ensemble Learning | - |
dc.subject.keywordAuthor | Multimodal | - |
dc.subject.keywordAuthor | Phishing Website Detection | - |