A software classification scheme using binary-level characteristics for efficient software filtering
- Authors
- Kim, Yesol; Cho, Seong-je; Han, Sangchul; You, Ilsun
- Issue Date
- Jan-2018
- Publisher
- Springer Verlag
- Keywords
- Software classification; Software filtering; Text mining; Random Forest; Portable executable
- Citation
- Soft Computing, v.22, no.2, pp 595 - 606
- Pages
- 12
- Journal Title
- Soft Computing
- Volume
- 22
- Number
- 2
- Start Page
- 595
- End Page
- 606
- URI
- https://scholarworks.bwise.kr/sch/handle/2021.sw.sch/6314
- DOI
- 10.1007/s00500-016-2357-x
- ISSN
- 1432-7643
- 1433-7479
- Abstract
- Software filtering systems can be employed to detect and filter out pirated or counterfeit software on Web sites and peer-to-peer networks. They determine whether a suspicious program is legal by comparing it with original programs in a database or on the market. To identify pirated or counterfeit software, a software filtering system must measure the similarity between the suspicious program and the original ones. The comparison overhead can be very high because, in the worst case, the suspicious program is compared with every program in the database or market. This paper proposes a software classification scheme for efficient software filtering systems. The scheme focuses specifically on Windows portable executable files, which have been prime targets for software pirates. The scheme extracts software characteristics from a suspicious program and quickly classifies it into one of several pre-defined categories based on those characteristics. In most cases, the suspicious program is then compared only with the programs in that category; thus, the comparison overhead is reduced. We propose two classification methods. The first extracts strings from the GUI-related resources of a program and computes the relevance of the program to each category based on pre-computed scores of the strings. The second extracts API call frequencies from a program's executable code and uses the Random Forest technique to classify the program. Experimental results show that the proposed scheme can classify programs effectively and can reduce the comparison overhead significantly.
- Appears in Collections
- College of Engineering > Department of Information Security Engineering > 1. Journal Articles
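
The first classification method described in the abstract (scoring a program against each category using strings from its GUI-related resources) can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, the score table, the summation rule, and the function names `category_relevance` and `classify` are all assumptions made for illustration, and the extraction of strings from a portable executable is not shown.

```python
# Hypothetical pre-computed string scores: category -> {string: score}.
# The categories and values are invented for illustration only.
CATEGORY_SCORES = {
    "media_player": {"play": 2.0, "pause": 2.0, "playlist": 3.0, "volume": 1.5},
    "text_editor": {"find": 1.0, "replace": 2.0, "word wrap": 3.0, "font": 1.5},
    "archiver": {"extract": 3.0, "compress": 3.0, "archive": 2.5},
}


def category_relevance(strings, scores=CATEGORY_SCORES):
    """Sum the pre-computed scores of the given strings for each category.

    Summation is an assumed relevance rule; the paper may define it differently.
    """
    relevance = {category: 0.0 for category in scores}
    for s in strings:
        key = s.strip().lower()  # normalize case and surrounding whitespace
        for category, table in scores.items():
            relevance[category] += table.get(key, 0.0)
    return relevance


def classify(strings, scores=CATEGORY_SCORES):
    """Return the category with the highest relevance, or None if nothing matches."""
    relevance = category_relevance(strings, scores)
    best = max(relevance, key=relevance.get)
    return best if relevance[best] > 0 else None


if __name__ == "__main__":
    # Strings as they might be extracted from menus and dialogs (assumed input).
    gui_strings = ["Play", "Pause", "Font", "Playlist"]
    print(category_relevance(gui_strings))
    print(classify(gui_strings))
```

In this sketch, a result of `None` stands for the case where no category matches, in which a filtering system would presumably fall back to comparing the suspicious program against every program in the database.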
