Detailed Information

BinDiffNN: Learning Distributed Representation of Assembly for Robust Binary Diffing Against Semantic Differences

Full metadata record
DC Field | Value | Language
dc.contributor.author | Ullah, Sami | -
dc.contributor.author | Oh, Heekuck | -
dc.date.accessioned | 2023-05-03T09:48:15Z | -
dc.date.available | 2023-05-03T09:48:15Z | -
dc.date.issued | 2022-09 | -
dc.identifier.issn | 0098-5589 | -
dc.identifier.issn | 1939-3520 | -
dc.identifier.uri | https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/112784 | -
dc.description.abstract | Binary diffing is the process of discovering the differences and similarities in functionality between two binary programs. Previous research approaches binary diffing as a function matching problem: it first formulates an initial 1:1 mapping between functions, and then computes a sequence matching ratio to classify two functions as an exact match, a partial match, or no match. Existing techniques are most accurate when detecting exact matches and are not efficient at detecting partially changed functions, especially those with minor patches. These drawbacks stem from two major challenges: (i) in the 1:1 mapping phase, a strict policy is used to match function features; and (ii) in the classification phase, an assembly snippet is treated as ordinary text and compared using sequence matching. An instruction has a unique structure, i.e., mnemonics and registers occupy specific positions in an instruction and are also semantically related, which makes assembly code different from general text. Sequence matching performs best on general text but fails to detect structural and semantic changes at the instruction level; thus, its use for classification produces many false results. In this research, we address these challenges with a two-fold solution. For the 1:1 mapping phase, we propose computationally inexpensive features, which are compared using distance-based selection criteria to map similar functions and filter out unmatched functions. For the classification phase, we propose a Siamese binary-classification neural network in which each branch is an attention-based distributed learning embedding neural network that learns the semantic similarity among assembly instructions and learns to highlight changes at the instruction level, and a final fully connected layer learns to accurately classify two 1:1 mapped functions as either an exact or a partial match. We used x86 kernel binaries for training and achieved approximately 99% classification accuracy, which is higher than that of existing binary diffing techniques and tools. | -
dc.format.extent | 25 | -
dc.language | English | -
dc.language.iso | ENG | -
dc.publisher | Institute of Electrical and Electronics Engineers | -
dc.title | BinDiffNN: Learning Distributed Representation of Assembly for Robust Binary Diffing Against Semantic Differences | -
dc.type | Article | -
dc.publisher.location | United States | -
dc.identifier.doi | 10.1109/TSE.2021.3093926 | -
dc.identifier.scopusid | 2-s2.0-85111022125 | -
dc.identifier.wosid | 000854591500014 | -
dc.identifier.bibliographicCitation | IEEE Transactions on Software Engineering, v.48, no.9, pp. 3442-3466 | -
dc.citation.title | IEEE Transactions on Software Engineering | -
dc.citation.volume | 48 | -
dc.citation.number | 9 | -
dc.citation.startPage | 3442 | -
dc.citation.endPage | 3466 | -
dc.type.docType | Article | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scie | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalWebOfScienceCategory | Computer Science, Software Engineering | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.subject.keywordAuthor | Asm2Vec | -
dc.subject.keywordAuthor | attention network | -
dc.subject.keywordAuthor | binary diffing | -
dc.subject.keywordAuthor | exact match | -
dc.subject.keywordAuthor | Inst2vec | -
dc.subject.keywordAuthor | partial match | -
dc.subject.keywordAuthor | siamese neural network | -
dc.identifier.url | https://ieeexplore.ieee.org/document/9470904 | -
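
Illustrative note on the abstract: the sequence-matching baseline it criticizes can be sketched as follows. This is a minimal sketch, not the authors' code and not the method of any particular diffing tool. It assumes Python's standard `difflib.SequenceMatcher` as the sequence matcher; the tokenizer, the function names, and both thresholds are hypothetical choices made only for illustration. It shows that a text-level ratio gives the same score to a changed mnemonic (`add` to `sub`) and to a changed immediate value, because it does not know which position in an instruction a token occupies.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds, chosen only for this illustration.
EXACT_THRESHOLD = 1.0
PARTIAL_THRESHOLD = 0.6


def tokenize(asm_lines):
    """Flatten assembly lines into one token list, i.e. treat code as plain text."""
    tokens = []
    for line in asm_lines:
        tokens.extend(line.replace(",", " ").split())
    return tokens


def match_ratio(func_a, func_b):
    """Sequence matching ratio between two functions given as lists of lines."""
    matcher = SequenceMatcher(None, tokenize(func_a), tokenize(func_b), autojunk=False)
    return matcher.ratio()


def classify(func_a, func_b):
    """Label a 1:1 mapped function pair from the text-level ratio alone."""
    ratio = match_ratio(func_a, func_b)
    if ratio >= EXACT_THRESHOLD:
        return "exact"
    if ratio >= PARTIAL_THRESHOLD:
        return "partial"
    return "no-match"


# Toy x86 snippets (made up for the example).
original = ["push ebp", "mov ebp, esp", "mov eax, [ebp+8]",
            "add eax, 1", "pop ebp", "ret"]
# Mnemonic changed: the operation itself is different.
mnemonic_changed = ["push ebp", "mov ebp, esp", "mov eax, [ebp+8]",
                    "sub eax, 1", "pop ebp", "ret"]
# Immediate changed: same operation, different constant.
immediate_changed = ["push ebp", "mov ebp, esp", "mov eax, [ebp+8]",
                     "add eax, 2", "pop ebp", "ret"]

if __name__ == "__main__":
    for name, variant in [("mnemonic", mnemonic_changed),
                          ("immediate", immediate_changed)]:
        print(name, classify(original, variant), match_ratio(original, variant))
```

Both variants differ from the original by exactly one token, so the ratio cannot tell them apart. A model that embeds mnemonics and operands by position and meaning, as the abstract proposes, is one way to separate such cases.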
Appears in Collections
COLLEGE OF COMPUTING > ERICA Division of Computer Science > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Oh, Hee kuck
ERICA College of Software Convergence (ERICA Division of Computer Science)
