Word Vector Representation and Ngram based Malicious URL Detection using Machine Learning
- Authors
- 황성운
- Issue Date
- 2-Feb-2017
- Publisher
- ICGHIT
- Citation
- ICGHIT 2018, v.Part 1, no.Part 1, pp.93 - 95
- Journal Title
- ICGHIT 2018
- Volume
- Part 1
- Number
- Part 1
- Start Page
- 93
- End Page
- 95
- URI
- https://scholarworks.bwise.kr/hongik/handle/2020.sw.hongik/6101
- Abstract
- Most Intrusion Detection Systems (IDS) today are signature-based. They are useful and accurate for detecting known attacks, but they are ineffective against unknown attacks. One promising method to overcome this problem is to apply machine learning detection techniques. In this paper, we present a new lexical approach that classifies URLs using machine learning techniques that extract patterns from the URLs. Our approach is based on word vector representation and n-gram models over blacklisted words. Our experiments show that the approach achieves an accuracy of 97.1% with a Support Vector Machine (SVM). Moreover, it remains robust, with a precision of 0.97 and a recall of 0.93.
- Appears in Collections
- College of Science and Technology > Department of Computer and Information Communications Engineering > 1. Journal Articles
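
The lexical approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the blacklist, the n-gram size, or the exact vector representation, so the word list `BLACKLIST_WORDS`, the choice of character trigrams, the simple bag-of-words vector, and all function names below are assumptions made for illustration. The sketch only builds feature vectors; in practice these would be passed to a classifier such as an SVM.

```python
import re
from collections import Counter

# Hypothetical blacklist; the paper's actual word list is not given in this record.
BLACKLIST_WORDS = ["login", "verify", "paypal"]


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [tok for tok in re.split(r"[^a-z0-9]+", url.lower()) if tok]


def char_ngrams(word, n):
    """Return all contiguous character n-grams of a word, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def build_vocabulary(urls):
    """Map each distinct token in the given URLs to a column index."""
    tokens = sorted({tok for url in urls for tok in tokenize(url)})
    return {tok: i for i, tok in enumerate(tokens)}


def featurize(url, vocab, blacklist=BLACKLIST_WORDS, n=3):
    """Build one feature vector for a URL.

    The first len(vocab) entries are token counts (a simple word vector).
    The remaining entries count, for each blacklist word, how many of its
    distinct character n-grams occur anywhere in the URL, so that
    near-misses such as "paypa1" still register.
    """
    counts = Counter(tokenize(url))
    word_part = [counts.get(tok, 0) for tok in vocab]  # vocab keeps index order
    lowered = url.lower()
    ngram_part = [
        sum(1 for gram in set(char_ngrams(word, n)) if gram in lowered)
        for word in blacklist
    ]
    return word_part + ngram_part


# Toy data, invented for this example only.
urls = [
    "http://paypa1-login.example.com/verify",
    "http://example.com/docs",
]
vocab = build_vocabulary(urls)
vectors = [featurize(url, vocab) for url in urls]
```

The n-gram counts are what let the first toy URL score against "paypal" even though the token itself is misspelled, which is one plausible reason to combine n-grams with whole-word features.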