Detailed Information

Efficient Phishing Website Detection via HTML Tag Sequence Analysis Using Encoder Models

Authors
Ahn, Jemin; Xiong, Zuobin; Cho, Homook; Kang, Kyungtae; Son, Junggab
Issue Date
Aug-2025
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Classification (of Information); Computer Crime; Html; Internet Of Things; Learning Algorithms; Learning Systems; Machine Learning; Network Security; Phishing; Websites; Defence Systems; Detection Methods; Html Tags; Machine-learning; Network Users; Phishing Websites; Security Measure; Security Mechanism; Sequence Analysis; Signal Encoding
Citation
Proceedings - International Conference on Computer Communications and Networks, ICCCN
Indexed
SCOPUS
Journal Title
Proceedings - International Conference on Computer Communications and Networks, ICCCN
URI
https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/126570
DOI
10.1109/ICCCN65249.2025.11133972
ISSN
1095-2055
Abstract
The rapid proliferation of Internet of Things (IoT) devices has led to a significant increase in the number of network users, prompting advancements in security mechanisms. Consequently, traditional attacks targeting specific vulnerabilities have become less effective due to these enhanced defense systems, leading attackers to increasingly adopt phishing strategies as a primary means of bypassing security measures. Among these, phishing websites have been increasing rapidly, exploiting the carelessness of countless users. In response, numerous phishing website detection methods have been investigated, with machine learning-based approaches emerging as a leading strategy. However, these machine learning-based classification methods require substantial computational resources, posing challenges for their direct application in the already widespread IoT environment. To address these challenges, we propose an efficient phishing website detection method based on HTML tag sequences, the core structural elements of websites, by leveraging encoder models known for their effectiveness in classifying sequential data. Our approach also incorporates a customized tokenizer and dictionary specifically tailored for HTML tags. Experiments conducted on publicly available datasets demonstrate that the proposed method achieves over 95% accuracy across key performance metrics. Furthermore, comparative analyses highlight several advantages of our method, including reduced model size and faster detection times compared to existing approaches.
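The abstract describes classifying pages from their HTML tag sequences using a tokenizer and dictionary built for HTML tags. The record gives no implementation details, so the following is only a minimal sketch of what such preprocessing might look like, using Python's standard `html.parser`. All names here (`TagSequenceParser`, `extract_tag_sequence`, `build_tag_dictionary`, `encode`, and the `[PAD]`/`[UNK]` entries) are hypothetical and are not taken from the paper; the authors' actual tokenizer and dictionary may differ.

```python
from html.parser import HTMLParser

# Hypothetical special entries; the paper's dictionary may differ.
PAD, UNK = "[PAD]", "[UNK]"


class TagSequenceParser(HTMLParser):
    """Collects the names of opening tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        self.tags.append(tag)


def extract_tag_sequence(html):
    """Return the list of opening-tag names found in an HTML string."""
    parser = TagSequenceParser()
    parser.feed(html)
    parser.close()
    return parser.tags


def build_tag_dictionary(sequences):
    """Assign an integer id to every tag seen in the given sequences."""
    vocab = {PAD: 0, UNK: 1}
    for seq in sequences:
        for tag in seq:
            if tag not in vocab:
                vocab[tag] = len(vocab)
    return vocab


def encode(seq, vocab, max_len):
    """Map tags to ids, truncating or padding to exactly max_len entries."""
    ids = [vocab.get(tag, vocab[UNK]) for tag in seq[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


if __name__ == "__main__":
    page = "<html><body><form><input type='password'></form></body></html>"
    tags = extract_tag_sequence(page)
    vocab = build_tag_dictionary([tags])
    print(tags)
    print(encode(tags, vocab, max_len=8))
```

The fixed-length id lists produced by `encode` are the kind of input an encoder-based sequence classifier would consume. Because only tag names are kept, the dictionary stays small, which is in line with the abstract's emphasis on reduced model size.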
Appears in Collections
COLLEGE OF COMPUTING > DEPARTMENT OF ARTIFICIAL INTELLIGENCE > 1. Journal Articles

Related Researcher

Kang, Kyung tae
ERICA College of Software Convergence (DEPARTMENT OF ARTIFICIAL INTELLIGENCE)