Detailed Information

Empirical feature learning in application-based samples: A case study (open access)

Authors
Nguyen-Vu, Long; Jung, Souhwan
Issue Date
Oct-2022
Publisher
ELSEVIER
Keywords
Feature learning; Malware detection; Model selection; Neural network; Mobile security
Citation
JOURNAL OF COMPUTATIONAL SCIENCE, v.64
Journal Title
JOURNAL OF COMPUTATIONAL SCIENCE
Volume
64
URI
http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/43336
DOI
10.1016/j.jocs.2022.101839
ISSN
1877-7503
Abstract
In machine learning, feature selection is an intrinsic factor that contributes to the success of a model. With the rise of deep learning in recent years, this process is sometimes overlooked, because neural networks enable automatic feature extraction: the ability to select relevant features from the feature space without manual intervention. While this is a powerful technique when the data samples are images, it is not straightforward for application-based samples. In this work, we explore the use of machine learning in Android application classification. We start with a research question: "Given a classification problem on the same dataset, why do several proposed models achieve considerably good performance, despite the fact that they use different training features?" We hypothesize two reasons for this phenomenon: (1) the models overfit the dataset in question, and (2) the features are non-i.i.d. (not independent and identically distributed). We confirm these cases by reviewing previous studies on the same datasets. By analyzing the mapping between components in Android, we conclude that the strong correlations among them allow machine learning models to learn efficiently from just a subset of those features. Experiments show that a feedforward DNN or CNN, when provided with sufficient features, can generalize as well as more complicated architectures. The findings show that it is possible to train application classifiers on fewer features and simpler network architectures with negligible performance degradation.
Appears in Collections
ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Jung, Sou hwan
College of Information Technology (Department of IT Convergence)