OCR Meets the DarkWeb: Identifying the Content Type Regarding Illegal and Cybercrime

Kim, Donghyun; Jeon, Seungho; Shin, Jiho; Seo, Jung Taek

Full metadata record

DC Field	Value	Language
dc.contributor.author	Kim, Donghyun	-
dc.contributor.author	Jeon, Seungho	-
dc.contributor.author	Shin, Jiho	-
dc.contributor.author	Seo, Jung Taek	-
dc.date.accessioned	2024-06-04T06:30:25Z	-
dc.date.available	2024-06-04T06:30:25Z	-
dc.date.issued	2024-01	-
dc.identifier.issn	0302-9743	-
dc.identifier.issn	1611-3349	-
dc.identifier.uri	https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/91401	-
dc.description.abstract	The dark web provides features such as encryption and routing changes to ensure anonymity and make tracking difficult. Cybercriminals exploit these characteristics to earn revenue by distributing illegal and cybercrime content through the dark web. Such content includes drug and arms trafficking, counterfeit documents, malware, and the sale of personal information. Text crawling systems for the dark web have been developed and studied to counter the distribution of this content. However, because a traditional dark web text crawler collects all text on a page, identifying the exact content type can be difficult when a page serves several types of illegal and cybercrime content. In this paper, we propose a method that uses the text embedded within images to accurately identify the types of illegal and cybercrime content on the dark web. We conducted experiments that combine the text of web pages with the text extracted from their images to identify these content types. We collected keywords for three types of illegal and cybercrime content. The distribution and types of content were identified by checking whether the collected keywords appeared in dark web pages. Through experiments, we confirmed that using text embedded within images improves performance. Our proposed method accurately identified over 90% of dark web pages where drugs were distributed.	-
dc.format.extent	12	-
dc.language	English	-
dc.language.iso	ENG	-
dc.publisher	SPRINGER-VERLAG SINGAPORE PTE LTD	-
dc.title	OCR Meets the DarkWeb: Identifying the Content Type Regarding Illegal and Cybercrime	-
dc.type	Article	-
dc.identifier.wosid	001206151300016	-
dc.identifier.doi	10.1007/978-981-99-8024-6_16	-
dc.identifier.bibliographicCitation	INFORMATION SECURITY APPLICATIONS, WISA 2023, v.14402, pp 201 - 212	-
dc.description.isOpenAccess	N	-
dc.identifier.scopusid	2-s2.0-85182608939	-
dc.citation.endPage	212	-
dc.citation.startPage	201	-
dc.citation.title	INFORMATION SECURITY APPLICATIONS, WISA 2023	-
dc.citation.volume	14402	-
dc.type.docType	Proceedings Paper	-
dc.publisher.location	Singapore	-
dc.subject.keywordAuthor	Dark Web	-
dc.subject.keywordAuthor	Crawler	-
dc.subject.keywordAuthor	Illegal and Cybercrime Content	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalWebOfScienceCategory	Computer Science, Software Engineering	-
dc.relation.journalWebOfScienceCategory	Computer Science, Theory & Methods	-
dc.description.journalRegisteredClass	scopus	-
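
The keyword-matching step described in the abstract can be illustrated with a minimal sketch. It assumes the image text has already been extracted by some OCR tool and is passed in as plain strings. The category names, the keyword lists, and the function names `keyword_hits` and `identify_content_types` are hypothetical placeholders; this record does not list the paper's actual keywords or implementation.

```python
import re
from typing import Dict, Iterable, List

# Hypothetical keyword lists for illustration only; the paper's real
# keywords and its three content types are not given in this record.
KEYWORDS: Dict[str, List[str]] = {
    "drugs": ["cocaine", "cannabis", "mdma"],
    "weapons": ["pistol", "rifle", "ammunition"],
    "counterfeit": ["passport", "fake id", "banknote"],
}


def keyword_hits(text: str, keywords: Iterable[str]) -> List[str]:
    """Return the keywords that occur in text as whole words or phrases."""
    # Lowercase and collapse whitespace so multi-word keywords still match.
    normalized = " ".join(text.lower().split())
    return [
        kw for kw in keywords
        if re.search(r"\b" + re.escape(kw) + r"\b", normalized)
    ]


def identify_content_types(
    page_text: str,
    image_texts: Iterable[str] = (),
    keywords: Dict[str, List[str]] = KEYWORDS,
    min_hits: int = 1,
) -> Dict[str, List[str]]:
    """Map each matching content type to the keywords found on the page.

    page_text is the crawled HTML text; image_texts holds strings that an
    OCR step (assumed to run elsewhere) extracted from the page's images.
    """
    combined = " ".join([page_text, *image_texts])
    result: Dict[str, List[str]] = {}
    for content_type, type_keywords in keywords.items():
        hits = keyword_hits(combined, type_keywords)
        if len(hits) >= min_hits:
            result[content_type] = hits
    return result


page = "Welcome to our shop. Fast shipping."
images = ["Pure COCAINE 1g", "Fake   ID and passport"]
text_only = identify_content_types(page)
text_and_ocr = identify_content_types(page, images)
```

In this toy input the page text alone matches no category, while adding the image text surfaces two, which mirrors the abstract's claim that text embedded in images helps identification. The `min_hits` threshold is an assumed tuning knob, not something stated in the source.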

Appears in Collections: ETC > 1. Journal Articles

Related Researcher

SEO, JUNGTAEK: College of IT Convergence (Department of Computer Engineering, Smart Security Major)
