Dark Side of the Web: Dark Web Classification Based on TextCNN and Topic Modeling Weight
Open access

Authors
Shin, Gun-Yoon; Jang, Younghoan; Kim, Dong-Wook; Park, Sungjin; Park, A-Ran; Kim, Younghwan; Han, Myung-Mook
Issue Date
Jan-2024
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Dark Web; Feature extraction; Data models; Analytical models; Text categorization; Graph neural networks; Classification algorithms; Dark web; dark web analysis; text classification; topic modeling; model explanation
Citation
IEEE ACCESS, v.12, pp 36361 - 36371
Pages
11
Journal Title
IEEE ACCESS
Volume
12
Start Page
36361
End Page
36371
URI
https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/90821
DOI
10.1109/ACCESS.2023.3347737
ISSN
2169-3536
Abstract
The Dark Web is an internet domain that ensures user anonymity and has increasingly become a focal point for illegal activities and a repository for information on cyberattacks owing to the challenges in tracking its users. This study examined the classification of the Dark Web in relation to these cyber threats. We processed Dark Web texts to extract vector types suitable for machine learning classification. Traditional methods utilizing the entirety of Dark Web texts to generate features result in vectors including all words found on the Dark Web. However, this approach incorporates extraneous information in the vectors, diminishing learning effectiveness and extending processing duration. The research aimed to optimize the classification process by selectively focusing on keywords within each class, thereby curtailing word vector dimensions. This optimization was facilitated by leveraging the anonymity characteristic of the Dark Web and employing topic-modeling-based weight generation. These methods enabled the creation of word vectors with a constrained feature set, enhancing the distinction of Dark Web classes. To further improve classification performance, we integrated TextCNN with topic modeling weights. For validation, we employed two datasets and compared the performance of the model with other text classification algorithms, where the proposed model demonstrated superior effectiveness in Dark Web classification.
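The idea of restricting features to class-specific keywords can be illustrated with a minimal sketch. This is not the authors' implementation: the topic-modeling weights are replaced here by a simple frequency-based class-specificity score, the TextCNN stage is omitted, and every name (`class_keyword_weights`, `build_vocabulary`, `vectorize`) and the toy documents are hypothetical.

```python
from collections import Counter


def class_keyword_weights(docs_by_class, top_k=3):
    """Keep the top_k terms per class by a simple class-specificity score.

    ASSUMPTION: this score is a stand-in for topic-model weights.
    score = count in class * (share of the term's occurrences in that class)
    """
    class_counts = {
        label: Counter(tok for doc in docs for tok in doc.lower().split())
        for label, docs in docs_by_class.items()
    }
    total = Counter()
    for counts in class_counts.values():
        total.update(counts)

    weights = {}
    for label, counts in class_counts.items():
        scored = {t: c * (c / total[t]) for t, c in counts.items()}
        # Sort by descending score, then alphabetically for determinism.
        top = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        weights[label] = dict(top)
    return weights


def build_vocabulary(weights):
    """Reduced vocabulary: the union of the per-class keywords only."""
    terms = sorted({t for w in weights.values() for t in w})
    return {t: i for i, t in enumerate(terms)}


def vectorize(text, vocab, weights):
    """Weighted bag-of-words over the reduced vocabulary."""
    term_weight = {}
    for w in weights.values():
        for t, s in w.items():
            term_weight[t] = max(term_weight.get(t, 0.0), s)
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += term_weight[tok]
    return vec


# Toy, made-up documents for two hypothetical classes.
docs_by_class = {
    "drugs": ["buy cannabis online shipping", "cannabis seeds shipping"],
    "hacking": ["exploit kit malware download", "malware source exploit"],
}
weights = class_keyword_weights(docs_by_class, top_k=2)
vocab = build_vocabulary(weights)
vector = vectorize("cheap cannabis and malware", vocab, weights)
```

In this toy run the ten distinct terms in the corpus shrink to a four-term vocabulary, which mirrors the dimensionality reduction the abstract describes. In a fuller pipeline, the resulting weights would presumably feed a classifier such as TextCNN.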
Appears in Collections
ETC > 1. Journal Articles
