Privacy-Preserving Phishing Detection in HTML Code Using Split Learning
- Authors
- Kim, Jungin; Kim, Yushin; Lee, Sejong; Cho, Sunghyun
- Issue Date
- Oct-2024
- Publisher
- KICS
- Keywords
- split learning; collaborative learning; distributed learning; large language model; transformer; phishing
- Citation
- 2024 International Conference on Information and Communication Technology Convergence (ICTC), pp. 173-178
- Pages
- 6
- Indexed
- SCOPUS
- Journal Title
- 2024 International Conference on Information and Communication Technology Convergence (ICTC)
- Start Page
- 173
- End Page
- 178
- URI
- https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/120718
- Abstract
- Split learning is a distributed learning technique that enables multiple data owners to collaboratively train deep learning models without sharing their raw data, thereby preserving privacy and reducing computational burdens. This study investigates the application of split learning to phishing detection based on HyperText Markup Language (HTML) code analysis. By integrating a large language model (LLM) within a split learning framework, we aim to enhance the detection of phishing attempts while maintaining data privacy and optimizing computational resources. We conducted extensive experiments comparing the performance of the LLM split learning model with traditional centralized models, assessing scenarios with biased client sampling, varying client numbers, and pre-trained server models. The results indicate that the split learning model achieves accuracy comparable to that of centralized models, demonstrating its robustness and efficiency. Our findings underscore the potential of split learning in developing privacy-preserving and computationally efficient anti-phishing systems.
- Appears in Collections
- COLLEGE OF COMPUTING > ERICA Division of Computer Science > 1. Journal Articles
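
The abstract describes split learning only at a high level. As a rough illustration of the general idea (not the authors' LLM-based model), the following minimal sketch splits a tiny two-layer classifier between a client part and a server part. Everything here is an assumption made for illustration: the function names, the three toy features standing in for HTML-derived signals, the layer sizes, and the choice to send labels to the server and to share the client-side weights between clients in turn. Only activations and their gradients cross the client/server boundary; the raw feature vectors never do.

```python
import math
import random

def client_forward(w_client, x):
    """Client-side layer: turns raw features into activations sent to the server."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_client]

def server_step(w_server, h, y, lr):
    """Server-side layer: logistic output on received activations.

    Updates the server weights and returns (loss, gradient w.r.t. activations).
    Assumption: the label y is shared with the server.
    """
    z = sum(w * hi for w, hi in zip(w_server, h))
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
    dz = p - y
    grad_h = [dz * w for w in w_server]  # computed before the weight update
    for j, hi in enumerate(h):
        w_server[j] -= lr * dz * hi
    return loss, grad_h

def client_backward(w_client, x, h, grad_h, lr):
    """Client finishes backpropagation using the gradient returned by the server."""
    for j, row in enumerate(w_client):
        d_pre = grad_h[j] * (1.0 - h[j] ** 2)  # derivative of tanh
        for i, xi in enumerate(x):
            row[i] -= lr * d_pre * xi

def predict(w_client, w_server, x):
    """Probability of the positive (phishing) class for one feature vector."""
    h = client_forward(w_client, x)
    z = sum(w * hi for w, hi in zip(w_server, h))
    return 1.0 / (1.0 + math.exp(-z))

def train(clients, hidden=4, epochs=300, lr=0.3, seed=0):
    """Sequential split training: clients take turns, raw data stays local."""
    rng = random.Random(seed)
    n_features = len(clients[0][0][0])
    w_client = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                for _ in range(hidden)]
    w_server = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    history = []
    for _ in range(epochs):
        total, count = 0.0, 0
        for data in clients:
            for x, y in data:
                h = client_forward(w_client, x)                  # client -> server
                loss, grad_h = server_step(w_server, h, y, lr)   # server -> client
                client_backward(w_client, x, h, grad_h, lr)
                total += loss
                count += 1
        history.append(total / count)  # mean loss per epoch
    return w_client, w_server, history

# Hypothetical toy features: [has_password_form, external_link_ratio, bias]
toy_clients = [
    [([1.0, 0.9, 1.0], 1), ([0.0, 0.1, 1.0], 0)],
    [([1.0, 0.8, 1.0], 1), ([0.0, 0.2, 1.0], 0)],
]
w_c, w_s, losses = train(toy_clients)
print("first/last epoch loss:", round(losses[0], 3), round(losses[-1], 3))
```

In a real deployment the client and server parts would run on different machines, and the client part would typically be the first layers of a larger network; the single shared `w_client` above merely simulates clients handing the client-side weights to one another.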
