Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Ji-Hyun | - |
dc.date.available | 2018-05-10T15:17:03Z | - |
dc.date.created | 2018-04-17 | - |
dc.date.issued | 2009-09-01 | - |
dc.identifier.issn | 0167-9473 | - |
dc.identifier.uri | http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/15773 | - |
dc.description.abstract | We consider the accuracy estimation of a classifier constructed on a given training sample. The naive resubstitution estimate is known to have a downward bias problem. The traditional approach to tackling this bias problem is cross-validation. The bootstrap is another way to bring down the high variability of cross-validation. But a direct comparison of the two estimators, cross-validation and bootstrap, is not fair because the latter estimator requires much heavier computation. We performed an empirical study to compare the .632+ bootstrap estimator with the repeated 10-fold cross-validation and the repeated one-third holdout estimator. All the estimators were set to require about the same amount of computation. In the simulation study, the repeated 10-fold cross-validation estimator was found to have better performance than the .632+ bootstrap estimator when the classifier is highly adaptive to the training sample. We have also found that the .632+ bootstrap estimator suffers from a bias problem for large samples as well as for small samples. (C) 2009 Elsevier B.V. All rights reserved. | - |
dc.publisher | ELSEVIER SCIENCE BV | - |
dc.relation.isPartOf | COMPUTATIONAL STATISTICS & DATA ANALYSIS | - |
dc.title | Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap | - |
dc.type | Article | - |
dc.identifier.doi | 10.1016/j.csda.2009.04.009 | - |
dc.type.rims | ART | - |
dc.identifier.bibliographicCitation | COMPUTATIONAL STATISTICS & DATA ANALYSIS, v.53, no.11, pp.3735 - 3745 | - |
dc.description.journalClass | 1 | - |
dc.identifier.wosid | 000267505600001 | - |
dc.identifier.scopusid | 2-s2.0-65749119811 | - |
dc.citation.endPage | 3745 | - |
dc.citation.number | 11 | - |
dc.citation.startPage | 3735 | - |
dc.citation.title | COMPUTATIONAL STATISTICS & DATA ANALYSIS | - |
dc.citation.volume | 53 | - |
dc.contributor.affiliatedAuthor | Kim, Ji-Hyun | - |
dc.type.docType | Article | - |
dc.description.journalRegisteredClass | scopus | - |
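
The estimators named in the abstract can be illustrated with a minimal sketch. It is not the paper's code. Everything in it is an assumption made for illustration: the 1-nearest-neighbour classifier (chosen because it is highly adaptive to the training sample), the synthetic two-class Gaussian data, the function names, and the repeat counts (5 repeats of 10-fold cross-validation and 50 hold-out splits, so that both fit the classifier 50 times). The .632+ bootstrap estimator is not shown.

```python
import random

def nearest_neighbour_predict(train, x):
    """Predict the label of x with a 1-nearest-neighbour rule (illustrative classifier)."""
    nearest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return nearest[1]

def count_errors(train, test):
    """Number of test points misclassified by a classifier built on train."""
    return sum(nearest_neighbour_predict(train, x) != y for x, y in test)

def resubstitution(data):
    """Naive estimate: test on the same sample the classifier was built on."""
    return count_errors(data, data) / len(data)

def repeated_kfold(data, k=10, repeats=5, seed=0):
    """Average the k-fold cross-validation error over several random partitions."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        idx = list(range(len(data)))
        rng.shuffle(idx)
        wrong = 0
        for fold in range(k):
            test_idx = set(idx[fold::k])  # every k-th shuffled index forms one fold
            train = [data[i] for i in idx if i not in test_idx]
            test = [data[i] for i in test_idx]
            wrong += count_errors(train, test)
        estimates.append(wrong / len(data))  # each point is tested exactly once
    return sum(estimates) / repeats

def repeated_holdout(data, test_fraction=1 / 3, repeats=50, seed=0):
    """Average the hold-out error over several random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, round(len(data) * test_fraction))
    estimates = []
    for _ in range(repeats):
        idx = list(range(len(data)))
        rng.shuffle(idx)
        test = [data[i] for i in idx[:n_test]]
        train = [data[i] for i in idx[n_test:]]
        estimates.append(count_errors(train, test) / n_test)
    return sum(estimates) / repeats

def make_data(n=60, seed=1):
    """Synthetic two-class sample: overlapping 2-D Gaussians (assumed, not from the paper)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        point = (rng.gauss(label, 1.0), rng.gauss(label, 1.0))
        data.append((point, label))
    return data

data = make_data()
print("resubstitution:       ", resubstitution(data))
print("repeated 10-fold CV:  ", repeated_kfold(data))
print("repeated 1/3 hold-out:", repeated_holdout(data))
```

With a 1-nearest-neighbour rule every training point is its own nearest neighbour, so the resubstitution estimate is zero, which shows the downward bias the abstract mentions in its most extreme form. The two resampling estimates are computed on points the classifier has not seen and therefore come out above zero on overlapping classes.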