Credible, resilient, and scalable detection of software plagiarism using authority histograms
- Authors
- Chae, Dong-Kyu; Ha, Jiwoon; Kim, Sang-Wook; Kang, BooJoong; Im, Eul Gyu; Park, SunJu
- Issue Date
- Mar-2016
- Publisher
- ELSEVIER
- Keywords
- Software plagiarism detection; Birthmark; Similarity analysis; Static analysis
- Citation
- KNOWLEDGE-BASED SYSTEMS, v.95, pp.114 - 124
- Indexed
- SCIE; SCOPUS
- Journal Title
- KNOWLEDGE-BASED SYSTEMS
- Volume
- 95
- Start Page
- 114
- End Page
- 124
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/155002
- DOI
- 10.1016/j.knosys.2015.12.009
- ISSN
- 0950-7051
- Abstract
- Software plagiarism has become a serious threat to the health of the software industry. A software birthmark indicates unique characteristics of a program that can be used to analyze the similarity between two programs and provide proof of plagiarism. In this paper, we propose a novel birthmark, Authority Histograms (AH), which satisfies three essential requirements for good birthmarks: resiliency, credibility, and scalability. Existing birthmarks fail to satisfy all three simultaneously. AH reflects not only the frequency of APIs but also their call orders, whereas previous birthmarks rarely consider both together. This property provides more accurate plagiarism detection, making our birthmark more resilient and credible than previously proposed birthmarks. By using random walk with restart when generating AH, we make our proposal fully applicable even to large programs. Extensive experiments with a set of Windows applications verify that both the credibility and resiliency of AH exceed those of existing birthmarks; therefore, AH provides improved accuracy in detecting plagiarism. Moreover, the construction and comparison phases of AH complete within a reasonable time.
- Appears in Collections
- Seoul College of Engineering > Seoul Division of Computer Software > 1. Journal Articles
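
The abstract says AH combines API frequency with call order and is generated by random walk with restart. The sketch below illustrates that general idea only; it is not the authors' algorithm, whose details are not given in this record. It assumes a program is represented as a directed call graph of (caller, callee) pairs, scores each node by random walk with restart using a uniform restart distribution, and treats the scores of a fixed list of API names as the "histogram". The function names `rwr_scores`, `histogram`, and `cosine`, the restart probability of 0.15, and the use of cosine similarity for comparison are all hypothetical choices made for illustration.

```python
from collections import defaultdict
import math


def rwr_scores(edges, restart=0.15, iters=100):
    """Random walk with restart over a directed graph of (caller, callee) pairs.

    Assumption: the walker restarts at a uniformly chosen node; the paper
    may use a different restart distribution or graph construction.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
    uniform = 1.0 / len(nodes)
    score = {n: uniform for n in nodes}
    for _ in range(iters):
        nxt = {n: restart * uniform for n in nodes}
        for n in nodes:
            # A node with no outgoing calls spreads its mass over all nodes.
            targets = out[n] or nodes
            share = (1.0 - restart) * score[n] / len(targets)
            for t in targets:
                nxt[t] += share
        score = nxt
    return score


def histogram(score, api_names):
    """Hypothetical birthmark: scores of a fixed, ordered list of API names."""
    return [score.get(name, 0.0) for name in api_names]


def cosine(u, v):
    """Cosine similarity between two histograms (assumed comparison measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy call graph; the API names are only examples.
calls = [
    ("main", "CreateFile"),
    ("main", "ReadFile"),
    ("CreateFile", "ReadFile"),
    ("ReadFile", "CloseHandle"),
]
apis = ["CreateFile", "ReadFile", "CloseHandle"]
scores = rwr_scores(calls)
birthmark = histogram(scores, apis)
```

Because the score of a node depends on which nodes lead to it, two programs that call the same APIs equally often but in different orders can receive different histograms, which is the property the abstract attributes to AH.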