A Group Feature Ranking and Selection Method Based on Dimension Reduction Technique in High-Dimensional Data (open access)

Authors
Zubair, Iqbal Muhammad; Kim, Byunghoon
Issue Date
Nov-2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Dimension reduction; feature extraction; group feature ranking; group feature selection; high dimensional data
Citation
IEEE Access, v.10, pp. 125136-125147
Pages
12
Indexed
SCIE
SCOPUS
Journal Title
IEEE Access
Volume
10
Start Page
125136
End Page
125147
URI
https://scholarworks.bwise.kr/erica/handle/2021.sw.erica/111423
DOI
10.1109/ACCESS.2022.3225685
ISSN
2169-3536
Abstract
Group feature selection methods reduce model complexity by keeping the important feature groups and removing the irrelevant ones. To the best of our knowledge, few group feature selection methods provide the relative importance of each feature group. To address this, we developed a sparse group feature ranking method for high-dimensional data based on a dimension reduction technique. First, we apply Relief to each group to remove irrelevant individual features. Second, we extract a new feature that represents each feature group by applying Fisher linear discriminant analysis (FDA) to each group, which reduces the group's multiple dimensions to a single dimension. Finally, we estimate the relative importance of the extracted features with a random forest and select those with larger importance scores than the others. Machine-learning algorithms can then be trained and tested on the selected features. In the experiments, we compared the proposed method with the supervised group lasso (SGL) method on real-life high-dimensional datasets. The results show that the proposed method selects a few important group features, as the existing group feature selection method does, and additionally provides the ranking and relative importance of all group features. In terms of classification performance metrics, SGL performs slightly better with logistic regression, whereas the proposed method performs better with support vector machine, random forest, and gradient boosting.
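The three steps described in the abstract (Relief within each group, FDA to one dimension per group, random forest importance across groups) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `relief_weights` and `rank_groups` helpers, the `keep_frac` threshold, the `groups` dictionary layout, and the synthetic data are all assumptions introduced here, and the Relief variant is a basic binary-class version written from scratch. scikit-learn's `LinearDiscriminantAnalysis` stands in for FDA.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic binary-class Relief (illustrative): a feature gains weight when it
    differs on the nearest miss and loses weight when it differs on the nearest hit."""
    rng = np.random.default_rng(seed)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant features
    Xs = (X - X.min(axis=0)) / span  # scale each feature to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(Xs), size=n_iter):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to sample i
        dist[i] = np.inf  # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])
    return w / n_iter


def rank_groups(X, y, groups, keep_frac=0.5, seed=0):
    """groups: dict of group name -> list of column indices (hypothetical layout).
    keep_frac is an assumed cutoff; the abstract does not state one."""
    names, extracted = [], []
    for name, cols in groups.items():
        cols = np.asarray(cols)
        # Step 1: keep the highest-weighted features within the group.
        w = relief_weights(X[:, cols], y, seed=seed)
        n_keep = max(1, int(np.ceil(keep_frac * len(cols))))
        kept = cols[np.argsort(w)[::-1][:n_keep]]
        # Step 2: project the kept features onto a single discriminant axis.
        lda = LinearDiscriminantAnalysis(n_components=1)
        extracted.append(lda.fit_transform(X[:, kept], y)[:, 0])
        names.append(name)
    Z = np.column_stack(extracted)  # one extracted feature per group
    # Step 3: random forest importance of each extracted group feature.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(Z, y)
    order = np.argsort(rf.feature_importances_)[::-1]
    return [(names[j], float(rf.feature_importances_[j])) for j in order]


# Synthetic demo: only group "g0" carries class signal.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 12))
X[:, 0:4] += 2.0 * y[:, None]
groups = {"g0": [0, 1, 2, 3], "g1": [4, 5, 6, 7], "g2": [8, 9, 10, 11]}
ranking = rank_groups(X, y, groups)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The returned list orders every group by importance, so a selection rule (for example, keeping groups above some score threshold) can be applied afterwards; which rule the paper uses is not specified in the abstract.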
Appears in Collections
COLLEGE OF ENGINEERING SCIENCES > DEPARTMENT OF INDUSTRIAL & MANAGEMENT ENGINEERING > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Byunghoon
ERICA College of Engineering (DEPARTMENT OF INDUSTRIAL & MANAGEMENT ENGINEERING)