An efficient and effective wrapper based on paired t-test for learning naive Bayes classifiers from large-scale domains

Full metadata record
DC Field: Value
dc.contributor.author: Kim, C.
dc.contributor.author: Li, H.
dc.contributor.author: Shin, S.-Y.
dc.contributor.author: Hwang, K.-B.
dc.date.available: 2019-04-10T10:21:10Z
dc.date.created: 2018-04-17
dc.date.issued: 2013
dc.identifier.issn: 1877-0509
dc.identifier.uri: http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/32812
dc.description.abstract: Feature selection is one of the crucial steps in supervised learning, which influences the entire subsequent classification (or regression) process. The approaches to this task can largely be divided into two categories: filter-based and wrapper-based methods. Generally, the latter produces better results than the former with regard to given learning methods, though it consumes more computational resources for searches over the feature subset space. In this paper, we propose an Efficient wRapper based on a Paired t-Test (ERPT) for choosing features from large-scale data consisting of thousands of variables, such as microarrays. Statistical tests are a reasonable option when the number of features is very large because they have more predictable behavior and can be more efficient than most search methods. The proposed method consists of two phases: decrement phase and increment phase. In the decrement phase, it selects strongly relevant features. In the increment phase, it adds weakly relevant features, given the previously selected features. Our method, combined with naive Bayes classifiers, has been tested in an extensive set of experiments on University of California Irvine (UCI) Machine Learning Repository data. The results showed that the performance of the proposed method is comparable to that of the backward search-based wrapper and superior to that of the forward search-based wrapper. Furthermore, it demonstrated much better performance than the forward search-based wrapper when applied to three microarray data sets, for which the backward search-based wrapper was impractical because of the computational burden involved. The proposed method has the following three merits: (1) it is applicable to data sets having thousands of variables, (2) it provides a theoretically sound and controllable criterion for thresholding features, and (3) it finds feature subsets for the maximizing of classification performance on sparse domains. © 2013 The Authors.
dc.publisher: Elsevier B.V.
dc.relation.isPartOf: Procedia Computer Science
dc.title: An efficient and effective wrapper based on paired t-test for learning naive Bayes classifiers from large-scale domains
dc.type: Conference
dc.identifier.doi: 10.1016/j.procs.2013.10.014
dc.type.rims: CONF
dc.identifier.bibliographicCitation: 4th International Conference on Computational Systems-Biology and Bioinformatics, CSBio 2013, v.23, pp.102 - 112
dc.description.journalClass: 2
dc.identifier.scopusid: 2-s2.0-84896928595
dc.citation.conferenceDate: 2013-11-07
dc.citation.conferencePlace: Seoul
dc.citation.endPage: 112
dc.citation.startPage: 102
dc.citation.title: 4th International Conference on Computational Systems-Biology and Bioinformatics, CSBio 2013
dc.citation.volume: 23
dc.contributor.affiliatedAuthor: Hwang, K.-B.
dc.type.docType: Conference Paper
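
The abstract describes a two-phase selection driven by a paired t-test but gives no procedural detail, so the following is only a minimal sketch of one plausible reading, not the authors' ERPT algorithm. It assumes that a feature is "strongly relevant" when removing it from the full set significantly lowers per-fold accuracy, and "weakly relevant" when adding it to the already selected set significantly raises it. The names `paired_t`, `select_features`, `fold_scores`, and `t_crit` are hypothetical. `fold_scores` stands for any caller-supplied function that returns one accuracy per cross-validation fold for a given feature list (for example, from a naive Bayes classifier), and it must also accept an empty list.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic for samples a and b; positive when a > b on average."""
    diffs = [x - y for x, y in zip(a, b)]
    m = mean(diffs)
    s = stdev(diffs)  # needs at least two paired observations
    if s == 0.0:
        # All differences identical: no evidence if they are zero,
        # otherwise treat the difference as unboundedly significant.
        return 0.0 if m == 0 else math.copysign(math.inf, m)
    return m / (s / math.sqrt(len(diffs)))


def select_features(features, fold_scores, t_crit=1.833):
    """Two-phase sketch (an assumption, not the published procedure).

    fold_scores(feature_list) -> list of per-fold accuracies (hypothetical callback).
    t_crit defaults to the one-sided 5% critical value for 9 degrees of freedom,
    which matches 10 folds.
    """
    full = fold_scores(features)
    selected = []

    # Decrement phase: keep features whose removal significantly hurts accuracy.
    for f in features:
        reduced = fold_scores([g for g in features if g != f])
        if paired_t(full, reduced) > t_crit:
            selected.append(f)

    # Increment phase: add features that significantly help the current subset.
    for f in features:
        if f in selected:
            continue
        base = fold_scores(selected)
        if paired_t(fold_scores(selected + [f]), base) > t_crit:
            selected.append(f)

    return selected
```

Each phase makes one pass over the features, so the number of classifier evaluations grows linearly with the number of features, which is consistent with the abstract's claim that a statistical test can be cheaper than a subset search. Whether the published method updates the selected set greedily in the increment phase, as done here, is not stated in this record.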
Appears in Collections: College of Information Technology > School of Computer Science and Engineering > 2. Conference Papers

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Hwang, Kyu Baek
College of Information Technology (School of Computer Science and Engineering)