Encoder-Based Multimodal Ensemble Learning for High Compatibility and Accuracy in Phishing Website Detection

Authors
Ahn, Jemin; Akhavan, Dorian; Jung, Woohwan; Kang, Kyungtae; Son, Junggab
Issue Date
Sep-2025
Publisher
Springer Science and Business Media Deutschland GmbH
Keywords
Bidirectional Encoder Representations from Transformers (BERT); Encoder Model; Ensemble Learning; Multimodal; Phishing Website Detection
Citation
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, v.629 LNICST, pp 347 - 365
Pages
19
Indexed
SCOPUS
Journal Title
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Volume
629 LNICST
Start Page
347
End Page
365
URI
https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/126589
DOI
10.1007/978-3-031-94455-0_16
ISSN
1867-8211
1867-822X
Abstract
Phishing websites pose a significant threat to modern network security. In response, various detection methods have been developed, with deep learning-based approaches recently becoming dominant. As phishing tactics grow increasingly sophisticated, contemporary detection methods need diverse data types and advanced deep learning models. However, integrating various data types can cause compatibility issues that pose challenges for deep learning techniques. Furthermore, model accuracy can potentially be improved through careful selection of data types. To address these issues, this paper proposes a novel encoder-based multimodal ensemble learning approach that achieves high compatibility and accuracy in phishing website detection. Our method leverages two features, URLs and text content, both extracted from a single data source: HTML. HTML forms a crucial foundation of a website, and these two features most effectively illustrate every component of a website. Selecting them from a single data source therefore enhances not only the reliability but also the compatibility of our model. Since both features are text-based and sequential, we employ Bidirectional Encoder Representations from Transformers (BERT) for its superior performance in handling such data. Comprehensive experiments demonstrate that our model achieves a classification accuracy of 98.9%, surpassing both our baseline models and existing detection methods.
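The abstract does not say how the two features are extracted from HTML or how the encoder outputs are combined, so the following is only a minimal sketch under stated assumptions. It uses Python's standard html.parser to pull URLs (href and src attribute values) and visible text from a single HTML document, then combines two phishing probabilities with a weighted average. The names UrlTextExtractor, extract_features, and soft_vote are hypothetical, the weighted average stands in for whatever ensemble rule the paper uses, and the BERT encoders that would produce the two probabilities are not included.

```python
from html.parser import HTMLParser


class UrlTextExtractor(HTMLParser):
    """Collects URLs (href/src attributes) and visible text from one HTML document.

    Hypothetical helper for illustration; not the authors' extraction code.
    """

    SKIP_TAGS = {"script", "style"}  # their contents are not visible text

    def __init__(self):
        super().__init__()
        self.urls = []
        self.text_parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        # Record every non-empty href/src value, including those on script tags.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(value)

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_features(html):
    """Return (list of URLs, visible text) for a single HTML string."""
    parser = UrlTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.urls, " ".join(parser.text_parts)


def soft_vote(p_url, p_text, w_url=0.5):
    """Weighted average of two phishing probabilities.

    Assumption: a simple soft-voting rule, used here only as a placeholder
    for the paper's actual ensemble method.
    """
    return w_url * p_url + (1.0 - w_url) * p_text
```

In a fuller pipeline, the URL list and the text string would each be tokenized and passed to its own encoder-based classifier, and soft_vote would merge the two resulting probabilities into one score.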
Appears in Collections
COLLEGE OF COMPUTING > DEPARTMENT OF ARTIFICIAL INTELLIGENCE > 1. Journal Articles