Detection Method for Randomly Generated User IDs: Lift the Curse of Dimensionality

Ro, Inwoo; Kang, Boojoong; Seo, Choonghyun; Im, Eul Gyu

doi:10.1109/ACCESS.2022.3198687

Full metadata record

DC Field	Value	Language
dc.contributor.author	Ro, Inwoo	-
dc.contributor.author	Kang, Boojoong	-
dc.contributor.author	Seo, Choonghyun	-
dc.contributor.author	Im, Eul Gyu	-
dc.date.accessioned	2023-07-05T03:52:54Z	-
dc.date.available	2023-07-05T03:52:54Z	-
dc.date.created	2022-09-08	-
dc.date.issued	2022-08	-
dc.identifier.issn	2169-3536	-
dc.identifier.uri	https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/186183	-
dc.description.abstract	Internet services are essential to daily life, and user accounts are usually required to download or browse multimedia content from service providers such as Yahoo, Google, and YouTube. Attackers who perform malicious actions against these services use fake user accounts to hide their identity, or to continue malicious actions even after being caught by the service's detection system. Using a random string generation algorithm to produce user identification (ID) strings is a common way to create a large number of fake user accounts. To detect such IDs and defend against these attacks, researchers have proposed models that detect randomly generated IDs. Among these, the n-gram-based model using term frequency-inverse document frequency (TF-IDF) is regarded as the state of the art, but n-gram-based approaches suffer from the curse of dimensionality because the sparsity of the feature vector increases exponentially as n increases. As a result, n cannot be increased, which limits further improvement in detection accuracy. This paper proposes two methods to detect randomly generated IDs more accurately. The first avoids the curse of dimensionality by compressing the feature dimension size. The second reduces false positives by using pattern matching and the Bhattacharyya distance. We tested our method with about 3 million normal user IDs collected from a real portal service, 1 million IDs generated by a random string generation algorithm, and 8,541 IDs found after being used for malicious behavior in real portal services. The experimental results showed that the proposed method can improve detection accuracy as well as inference performance.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	Detection Method for Randomly Generated User IDs: Lift the Curse of Dimensionality	-
dc.type	Article	-
dc.contributor.affiliatedAuthor	Im, Eul Gyu	-
dc.identifier.doi	10.1109/ACCESS.2022.3198687	-
dc.identifier.scopusid	2-s2.0-85137897247	-
dc.identifier.wosid	000844077200001	-
dc.identifier.bibliographicCitation	IEEE ACCESS, v.10, pp.86020 - 86028	-
dc.relation.isPartOf	IEEE ACCESS	-
dc.citation.title	IEEE ACCESS	-
dc.citation.volume	10	-
dc.citation.startPage	86020	-
dc.citation.endPage	86028	-
dc.type.rims	ART	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalResearchArea	Engineering	-
dc.relation.journalResearchArea	Telecommunications	-
dc.relation.journalWebOfScienceCategory	Computer Science, Information Systems	-
dc.relation.journalWebOfScienceCategory	Engineering, Electrical & Electronic	-
dc.relation.journalWebOfScienceCategory	Telecommunications	-
dc.subject.keywordPlus	Computer crime	-
dc.subject.keywordPlus	Inverse problems	-
dc.subject.keywordPlus	Multimedia services	-
dc.subject.keywordPlus	Pattern matching	-
dc.subject.keywordPlus	Text processing	-
dc.subject.keywordPlus	Curse of dimensionality	-
dc.subject.keywordPlus	Detection accuracy	-
dc.subject.keywordPlus	Detection methods	-
dc.subject.keywordPlus	Generation algorithm	-
dc.subject.keywordPlus	Identity management systems	-
dc.subject.keywordPlus	N-grams	-
dc.subject.keywordPlus	Portal services	-
dc.subject.keywordPlus	Random string	-
dc.subject.keywordPlus	User ID	-
dc.subject.keywordPlus	Web-sites	-
dc.subject.keywordPlus	Authentication	-
dc.subject.keywordAuthor	Authentication	-
dc.subject.keywordAuthor	computer crime	-
dc.subject.keywordAuthor	identity management systems	-
dc.subject.keywordAuthor	web sites	-
dc.identifier.url	https://ieeexplore.ieee.org/document/9856640	-
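
The abstract names two quantitative ideas: the exponential growth of the n-gram feature space and the Bhattacharyya distance. The sketch below illustrates both in general terms only. It is not the authors' implementation, which this record does not describe. The 36-character alphabet (lowercase letters and digits), the per-character distributions, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed ID alphabet: lowercase letters and digits (not stated in the record).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def ngram_space_size(n, alphabet_size=len(ALPHABET)):
    """Number of distinct character n-grams, i.e. the feature dimension."""
    return alphabet_size ** n


def char_distribution(ids):
    """Relative frequency of each alphabet character over a list of IDs."""
    counts = Counter(ch for s in ids for ch in s.lower() if ch in ALPHABET)
    total = sum(counts.values())
    if total == 0:
        return {ch: 0.0 for ch in ALPHABET}
    return {ch: counts[ch] / total for ch in ALPHABET}


def bhattacharyya_distance(p, q):
    """D_B(p, q) = -ln(sum_x sqrt(p(x) * q(x))) for discrete distributions."""
    bc = sum(math.sqrt(p[ch] * q[ch]) for ch in ALPHABET)
    if bc == 0.0:
        return float("inf")  # no overlap between the two distributions
    bc = min(bc, 1.0)  # guard against floating-point overshoot
    return max(0.0, -math.log(bc))


if __name__ == "__main__":
    # The feature dimension grows as 36**n, so vectors become sparse quickly.
    for n in range(1, 5):
        print(n, ngram_space_size(n))

    # Hypothetical sample IDs, invented for this example.
    normal = char_distribution(["inwoo88", "bjkang", "seoulfan", "eulgyu01"])
    suspect = char_distribution(["x7q9zk2v", "p0w4jq8z", "z3k8xq7w"])
    print(bhattacharyya_distance(normal, suspect))
```

With 36 symbols, the feature dimension is 36 for n = 1, 1,296 for n = 2, 46,656 for n = 3, and 1,679,616 for n = 4, which is the growth the abstract refers to. A distance of 0 means the two character distributions are identical, and larger values mean less overlap.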

Files in This Item

Detection_Method_for_Randomly_Generated_User_IDs_Lift_the_Curse_of_Dimensionality.pdf 5.62 MB

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Related Researcher

Im, Eul Gyu: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
