Enhanced DGA botnet domain detection and family classification via n-gram analysis and Hellinger distance

Lee, Yeon joon

doi:10.1016/j.comnet.2025.111415

Detailed Information

Cited 0 times in Web of Science

Cited 0 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lee, Yeon joon	-
dc.date.accessioned	2025-07-24T07:00:20Z	-
dc.date.available	2025-07-24T07:00:20Z	-
dc.date.issued	2025-09	-
dc.identifier.issn	1389-1286	-
dc.identifier.issn	1872-7069	-
dc.identifier.uri	https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/126162	-
dc.description.abstract	Bot masters spread malware to create botnets and use Domain Generation Algorithms (DGAs) to evade blacklist-based detection methods with numerous generated domains, posing a significant threat to network security. Since detection alone cannot halt malware operations, classifying DGA domains into their respective botnet families is essential for enabling targeted countermeasures and addressing vulnerabilities in infected systems. However, most existing approaches focus primarily on distinguishing DGA domains from legitimate ones and face challenges when classifying domains from DGA families with similar character distributions, highlighting the need for improved techniques. In response, we expand the focus to DGA family classification and conduct in-depth analyses using eXplainable Artificial Intelligence (XAI) techniques to explore the impact of n-grams on classification performance. These analyses reveal that n-gram preprocessing and Hellinger Distance (HD)-based features derived from n-gram probability distributions can significantly enhance classification performance. Building on these insights, we propose an integrated framework with two components, an N-gram-based Multi-scale One-Dimensional Convolutional Neural Network model (N-MODCNN) and a machine learning (ML) classifier utilizing HD features, for detecting and classifying DGA domains. N-MODCNN detects DGA domains from n-gram preprocessed inputs, and detected domains are classified into their respective botnet families by a soft ensemble approach that integrates predictions from N-MODCNN and the ML classifier, enabling robust and accurate classification. Experiments on recent public datasets show that our framework achieves up to 99% detection and classification accuracy. For families with similar character distributions, it achieves F1-scores exceeding 90%, representing improvements of up to 72 percentage points over existing methods.	-
dc.format.extent	13	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	ELSEVIER	-
dc.title	Enhanced DGA botnet domain detection and family classification via n-gram analysis and Hellinger distance	-
dc.type	Article	-
dc.publisher.location	Netherlands	-
dc.identifier.doi	10.1016/j.comnet.2025.111415	-
dc.identifier.scopusid	2-s2.0-105008512584	-
dc.identifier.wosid	001517726400003	-
dc.identifier.bibliographicCitation	COMPUTER NETWORKS, v.269, pp 1 - 13	-
dc.citation.title	COMPUTER NETWORKS	-
dc.citation.volume	269	-
dc.citation.startPage	1	-
dc.citation.endPage	13	-
dc.type.docType	Article	-
dc.description.isOpenAccess	N	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Telecommunications	-
dc.relation.journalWebOfScienceCategory	Computer Science, Hardware & Architecture	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.relation.journalWebOfScienceCategory	Telecommunications	-
dc.subject.keywordAuthor	Botnet domain	-
dc.subject.keywordAuthor	Botnet family	-
dc.subject.keywordAuthor	Domain generation algorithm	-
dc.subject.keywordAuthor	Hellinger distance	-
dc.subject.keywordAuthor	N-gram	-
dc.identifier.url	https://www.sciencedirect.com/science/article/pii/S1389128625003822?pes=vor&utm_source=scopus&getft_integrator=scopus	-
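
The abstract refers to Hellinger Distance (HD) features derived from n-gram probability distributions. For two discrete distributions P and Q, the Hellinger distance is H(P, Q) = (1/sqrt(2)) * sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2), which ranges from 0 (identical) to 1 (no shared support). The following is a minimal sketch of that idea only, not the authors' implementation: the function names, the choice of bigrams, and the toy domain lists are assumptions made for illustration, and the record does not specify how the paper builds its features.

```python
from collections import Counter
from math import sqrt


def ngram_distribution(domains, n=2):
    """Relative frequency of character n-grams over a list of domain labels."""
    counts = Counter()
    for domain in domains:
        label = domain.lower()
        counts.update(label[i:i + n] for i in range(len(label) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def hellinger(p, q):
    """Hellinger distance between two distributions given as dicts."""
    grams = set(p) | set(q)
    squared = sum((sqrt(p.get(g, 0.0)) - sqrt(q.get(g, 0.0))) ** 2 for g in grams)
    return sqrt(squared) / sqrt(2)


def hd_features(domain, references, n=2):
    """Distance from one domain's n-gram distribution to each reference (hypothetical feature layout)."""
    own = ngram_distribution([domain], n)
    return {name: hellinger(own, ref) for name, ref in references.items()}


# Toy reference sets (illustrative only, not from the paper's datasets).
refs = {
    "benign": ngram_distribution(["google", "wikipedia", "github"]),
    "family_a": ngram_distribution(["xkqzvbnw", "qzxvkjpw"]),
}

if __name__ == "__main__":
    print(hd_features("goggle", refs))
```

In this sketch, each reference distribution stands in for one class (benign or a DGA family), and the resulting distances could be passed to any conventional classifier as a feature vector.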

Appears in Collections: COLLEGE OF COMPUTING > ERICA School of Computer Science > 1. Journal Articles

Related Researcher

Lee, Yeon joon: ERICA College of Computing (ERICA School of Computer Science)
