Classifying Malicious Documents on the Basis of Plain-Text Features: Problem, Solution, and Experiences (open access)
- Authors
- Hong, Jiwon; Jeong, Dongho; Kim, Sang-Wook
- Issue Date
- Apr-2022
- Publisher
- MDPI
- Keywords
- malware; malicious document; classification; text analysis
- Citation
- APPLIED SCIENCES-BASEL, v.12, no.8, pp.1 - 13
- Indexed
- SCIE; SCOPUS
- Journal Title
- APPLIED SCIENCES-BASEL
- Volume
- 12
- Number
- 8
- Start Page
- 1
- End Page
- 13
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/138950
- DOI
- 10.3390/app12084088
- ISSN
- 2076-3417
- Abstract
- Cyberattacks are frequently carried out using malicious documents. A malicious document is an electronic document that contains malicious code along with human-readable plain-text data. In this paper, we propose a novel framework that takes advantage of such plain-text data to determine whether a given document is malicious. We extract plain-text features from a corpus of electronic documents and use them to train a classification model for detecting malicious documents. Extensive experiments with different combinations of three well-known vectorization strategies and three popular classification methods on five types of electronic documents demonstrate that our framework provides high prediction accuracy in detecting malicious documents.
- Appears in Collections
- Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles
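The pipeline outlined in the abstract (extract plain text, vectorize it, train a classifier) can be illustrated with a minimal sketch. The abstract does not name the three vectorization strategies or the three classification methods, so the bag-of-words term counts and the multinomial Naive Bayes classifier used here are assumptions chosen for illustration, not the authors' method. The function names (`tokenize`, `train_naive_bayes`, `predict`) and the four toy training texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model on bag-of-words term counts.

    Assumption: term counts stand in for the (unnamed) vectorization
    strategies mentioned in the abstract.
    """
    class_doc_counts = Counter(labels)
    term_counts = {label: Counter() for label in class_doc_counts}
    for text, label in zip(docs, labels):
        term_counts[label].update(tokenize(text))
    vocab = set()
    for counts in term_counts.values():
        vocab.update(counts)
    return {
        "class_doc_counts": class_doc_counts,
        "term_counts": term_counts,
        "vocab": vocab,
        "n_docs": len(docs),
    }


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    vocab_size = len(model["vocab"])
    # Tokens never seen in training are ignored.
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    best_label, best_score = None, -math.inf
    for label, n_label_docs in model["class_doc_counts"].items():
        counts = model["term_counts"][label]
        total_terms = sum(counts.values())
        score = math.log(n_label_docs / model["n_docs"])  # class prior
        for token in tokens:
            # Laplace (add-one) smoothing avoids log(0) for unseen terms.
            score += math.log((counts[token] + 1) / (total_terms + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus: plain text as it might be extracted from documents.
train_docs = [
    "enable macros to view this protected invoice",
    "click enable content to open the secured document",
    "quarterly report on department budget and staffing",
    "meeting agenda and notes for the project review",
]
train_labels = ["malicious", "malicious", "benign", "benign"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "please enable macros to open the invoice"))
print(predict(model, "budget report and project notes"))
```

In practice the plain text would first have to be extracted from each document format, and a real evaluation would use a held-out test set rather than a handful of toy sentences.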