Detailed Information

OCR Meets the DarkWeb: Identifying the Content Type Regarding Illegal and Cybercrime

Authors
Kim, Donghyun; Jeon, Seungho; Shin, Jiho; Seo, Jung Taek
Issue Date
Jan-2024
Publisher
SPRINGER-VERLAG SINGAPORE PTE LTD
Keywords
Dark Web; Crawler; Illegal and Cybercrime Content
Citation
INFORMATION SECURITY APPLICATIONS, WISA 2023, v.14402, pp 201 - 212
Pages
12
Journal Title
INFORMATION SECURITY APPLICATIONS, WISA 2023
Volume
14402
Start Page
201
End Page
212
URI
https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/91401
DOI
10.1007/978-981-99-8024-6_16
ISSN
0302-9743
1611-3349
Abstract
The dark web provides features such as encryption and routing changes to ensure anonymity and make tracking difficult. Cybercriminals exploit these characteristics to earn revenue by distributing illegal and cybercrime content through the dark web as a business strategy. Illegal and cybercrime content includes drug and arms trafficking, counterfeit documents, malware, and the sale of personal information. Text crawling systems for the dark web have been developed and studied to counter the distribution of such content. However, because a traditional dark web text crawler collects all text, identifying the exact content type can be difficult when dark web pages serve several types of illegal and cybercrime content. In this paper, we propose a method that uses the text embedded within images to accurately identify the types of illegal and cybercrime content on the dark web. We conducted experiments that combine the text of a web page with the text extracted from its images. We collected keywords for three types of illegal and cybercrime content. The distribution and types of illegal and cybercrime content were identified by checking whether the collected keywords appeared in dark web pages. Through experiments, we confirmed that using text embedded within images improves performance. Our proposed method accurately identified over 90% of dark web pages where drugs were distributed.
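The keyword-matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, the keyword lists, and the function names `tokenize`, `keyword_hits`, and `classify` are all hypothetical, and the text recovered from images is assumed to have been produced by a separate OCR step that is not shown here.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, chosen only for illustration.
# The abstract does not list the paper's actual keywords.
KEYWORDS = {
    "drugs": {"cocaine", "mdma", "cannabis"},
    "weapons": {"pistol", "rifle", "ammo"},
    "counterfeit": {"passport", "banknote", "license"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_hits(page_text, image_texts=()):
    """Count keyword occurrences per category in page text plus OCR text.

    image_texts is assumed to hold strings already extracted from the
    page's images by some OCR tool.
    """
    tokens = Counter(tokenize(page_text))
    for ocr_text in image_texts:
        tokens.update(tokenize(ocr_text))
    return {
        category: sum(tokens[word] for word in words)
        for category, words in KEYWORDS.items()
    }


def classify(page_text, image_texts=()):
    """Return the category with the most keyword hits, or None if no hits."""
    hits = keyword_hits(page_text, image_texts)
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


if __name__ == "__main__":
    page = "Welcome to our shop. See the images for our catalogue."
    images = ["Pure MDMA 10g", "cannabis best price"]
    print(classify(page))          # page text alone contains no keywords
    print(classify(page, images))  # OCR text supplies the matching keywords
```

In this toy example the page text alone matches no keyword, while the text taken from the images does, which is the situation the abstract describes as motivating the use of image-embedded text.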
Files in This Item
There are no files associated with this item.
Appears in
Collections
ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

SEO, JUNGTAEK
College of IT Convergence (School of Computer Engineering (Smart Security Major))