Classifying Malicious Documents on the Basis of Plain-Text Features: Problem, Solution, and Experiences

Hong, Jiwon; Jeong, Dongho; Kim, Sang-Wook

doi:10.3390/app12084088

Title: Classifying Malicious Documents on the Basis of Plain-Text Features: Problem, Solution, and Experiences (open access)

Authors: Hong, Jiwon; Jeong, Dongho; Kim, Sang-Wook

Issue Date: Apr-2022

Publisher: MDPI

Keywords: malware; malicious document; classification; text analysis

Citation: APPLIED SCIENCES-BASEL, v.12, no.8, pp.1 - 13

Indexed: SCIE; SCOPUS

Journal Title: APPLIED SCIENCES-BASEL

Volume: 12

Number: 8

Start Page: 1

End Page: 13

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/138950

DOI: 10.3390/app12084088

ISSN: 2076-3417

Abstract: Cyberattacks are frequently carried out using malicious documents. A malicious document is an electronic document that contains malicious code along with some human-readable plain-text data. In this paper, we propose a novel framework that takes advantage of this plain-text data to determine whether a given document is malicious. We extract plain-text features from a corpus of electronic documents and use them to train a classification model for detecting malicious documents. Extensive experiments with different combinations of three well-known vectorization strategies and three popular classification methods on five types of electronic documents demonstrate that our framework achieves high prediction accuracy in detecting malicious documents.
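The pipeline outlined in the abstract (plain text in, vectorized features, trained classifier out) can be illustrated with a minimal sketch. The abstract does not name the three vectorization strategies or the three classification methods, so the code below substitutes bag-of-words counts and a multinomial Naive Bayes classifier purely as stand-ins. The class name `NaiveBayesTextClassifier`, the `tokenize` helper, and the toy corpus are all hypothetical, and the sketch assumes the plain text has already been extracted from each document.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over bag-of-words counts (Laplace smoothing).

    A stand-in for illustration only, not the paper's actual method.
    """

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            # Add-one smoothing: every vocabulary word gets one pseudo-count.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus: plain text already extracted from documents.
train_texts = [
    "enable macros to view the protected invoice content",
    "urgent payment overdue click enable content to view invoice",
    "minutes of the quarterly planning meeting and action items",
    "draft agenda for the research seminar next week",
]
train_labels = ["malicious", "malicious", "benign", "benign"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("please enable macros to view this invoice")
```

A real evaluation would need a labeled corpus for each document type and a held-out test split; four sentences only show the shape of the data flow.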

Files in This Item

applsci-12-04088.pdf 369.7 kB

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Related Researcher

Kim, Sang-Wook: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
