Detection of malicious URLs based on word vector representation and ngram

Quan Tran Hai; Hwang, Seong Oun

Cited 4 times in Web of Science

Cited 0 times in Scopus

Full metadata record

DC Field	Value	Language
dc.contributor.author	Quan Tran Hai	-
dc.contributor.author	Hwang, Seong Oun	-
dc.date.available	2020-10-20T06:45:05Z	-
dc.date.created	2020-06-10	-
dc.date.issued	2018-12	-
dc.identifier.issn	1064-1246	-
dc.identifier.uri	https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/78612	-
dc.description.abstract	Most Intrusion Detection Systems (IDS) today are signature-based. They are useful and accurate for detecting known attacks, but ineffective against unknown ones. Moreover, most cyber attacks start with malicious URLs: attackers try to trick users into clicking on them, which gives attackers an easy way to launch attacks. To defend against this, companies and organizations use IDS/IPS to detect malicious URLs using URL blacklists. This is very efficient for known malicious URLs but useless for unknown ones. To overcome this problem, a number of malicious Web site detection systems have been proposed. One of the most promising methods is to apply machine learning detection techniques. In this paper, we present a new lexical approach to classify URLs by using machine learning techniques which patternize the URLs. Our approach is based on natural language processing features which use word vector representation and ngram models on the blacklist words as the main features. This technique can help a classifier distinguish benign URLs from malicious ones. Our experiments show that our approach achieves an accuracy of 97.1% with SVM. Moreover, it maintains a high level of robustness, with 0.97 precision and 0.93 recall.	-
dc.language	English	-
dc.language.iso	en	-
dc.publisher	IOS PRESS	-
dc.relation.isPartOf	JOURNAL OF INTELLIGENT & FUZZY SYSTEMS	-
dc.title	Detection of malicious URLs based on word vector representation and ngram	-
dc.type	Article	-
dc.type.rims	ART	-
dc.description.journalClass	1	-
dc.identifier.wosid	000459214900010	-
dc.identifier.doi	10.3233/JIFS-169831	-
dc.identifier.bibliographicCitation	JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, v.35, no.6, pp.5889 - 5900	-
dc.description.isOpenAccess	N	-
dc.citation.endPage	5900	-
dc.citation.startPage	5889	-
dc.citation.title	JOURNAL OF INTELLIGENT & FUZZY SYSTEMS	-
dc.citation.volume	35	-
dc.citation.number	6	-
dc.contributor.affiliatedAuthor	Hwang, Seong Oun	-
dc.type.docType	Article; Proceedings Paper	-
dc.subject.keywordAuthor	Machine learning	-
dc.subject.keywordAuthor	cyber Security	-
dc.subject.keywordAuthor	URL classification	-
dc.subject.keywordAuthor	malicious URL	-
dc.relation.journalResearchArea	Computer Science	-
dc.relation.journalWebOfScienceCategory	Computer Science, Artificial Intelligence	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
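
The abstract names ngram models over blacklist words as one of the main lexical features. The record does not include the paper's feature definition, so the following is only a minimal sketch of one way such features could be built: it counts how often character n-grams of a few blacklist words occur in the tokens of a URL. The word list, the tokenization rule, the choice of n = 3, and all function names are assumptions made for illustration; the word vector features and the SVM classifier mentioned in the abstract are not covered here.

```python
import re
from collections import Counter

# Hypothetical blacklist words; the paper's actual list is not given in this record.
BLACKLIST_WORDS = ["login", "secure", "verify"]


def char_ngrams(word, n):
    """Return all contiguous character n-grams of `word`."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def tokenize_url(url):
    """Split a URL into lowercase alphanumeric tokens (an assumed tokenization)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", url.lower()) if tok]


def build_vocabulary(words, n):
    """Collect the sorted, unique n-grams of the blacklist words."""
    return sorted({gram for word in words for gram in char_ngrams(word, n)})


def url_features(url, vocabulary, n):
    """Count occurrences of each vocabulary n-gram across the URL's tokens."""
    counts = Counter(
        gram for tok in tokenize_url(url) for gram in char_ngrams(tok, n)
    )
    return [counts[gram] for gram in vocabulary]


if __name__ == "__main__":
    vocab = build_vocabulary(BLACKLIST_WORDS, 3)
    # Made-up example URLs, not taken from the paper's dataset.
    for url in ("http://secure-login.example.com/verify",
                "http://example.com/index.html"):
        print(url, url_features(url, vocab, 3))
```

Each URL becomes a fixed-length count vector whose positions follow the sorted vocabulary, which is the kind of input a classifier such as an SVM could consume. Matching on n-grams rather than whole words is chosen here so that partial or obfuscated occurrences of a blacklist word still contribute to the features.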

Files in This Item: There are no files associated with this item.

Appears in Collections: College of IT Convergence > Department of Computer Engineering > 1. Journal Articles

Related Researcher

Hwang, Seong Oun: College of IT Convergence (Division of Computer Engineering, Computer Engineering major)
