Trustworthy and interpretable stacking via TabNet-driven feature generation for breast cancer diagnosis (open access)
- Authors
- Xia, Lin; Chung, Yoona; Suo, Liqiu; Hong, Jeongmin; Kim, Eunchan
- Issue Date
- May-2026
- Publisher
- AMER INST MATHEMATICAL SCIENCES-AIMS
- Keywords
- computational intelligence; trustworthy analytics; robustness; explainable artificial intelligence; stacking ensemble; TabNet feature generation; healthcare tabular data
- Citation
- ELECTRONIC RESEARCH ARCHIVE, v.34, no.6, pp 3804 - 3842
- Pages
- 39
- Indexed
- SCIE
SCOPUS
- Journal Title
- ELECTRONIC RESEARCH ARCHIVE
- Volume
- 34
- Number
- 6
- Start Page
- 3804
- End Page
- 3842
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/217604
- DOI
- 10.3934/era.2026172
- ISSN
- 2688-1594
- Abstract
- Trustworthy analytics for healthcare requires models that are not only accurate but also interpretable and robust under distributional perturbations. In this paper, we propose an interpretable stacked ensemble framework that repurposes TabNet from an end-to-end classifier into an attention-guided feature generator for downstream learners. We constructed a dual-channel stacking architecture in which TabNet-derived embeddings and original tabular features were fed into heterogeneous gradient-boosted base learners (XGBoost and LightGBM) to enhance representation diversity, and the base learners' outputs were integrated by an interpretable logistic-regression meta-learner. For transparent and unbiased evaluation, we employed nested stratified cross-validation with fixed-budget hyperparameter tuning, together with systematic ablation studies. Experiments on the public Wisconsin Diagnostic Breast Cancer dataset showed that the proposed model achieved strong and stable performance (average accuracy 97.8% ± 1.0% under nested cross-validation) compared to a single TabNet baseline and conventional ensemble variants. Moreover, we assessed robustness under out-of-distribution-style covariate perturbations by injecting Gaussian noise at varying intensities, demonstrating that the stacking design mitigates the noise sensitivity of TabNet-derived representations and maintains a balanced sensitivity–specificity trade-off via adaptive thresholding. Overall, the proposed framework provides a reproducible template for combining deep tabular representation learning with explainable ensemble decision-making toward reliable data science applications in high-stakes domains.
- Appears in Collections
- Seoul College of Engineering > Seoul Department of Information Systems > 1. Journal Articles
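
The robustness check described in the abstract (Gaussian noise injected at varying intensities, followed by a sensitivity/specificity read-out at a chosen decision threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that noise is scaled by each feature's standard deviation and that the threshold is picked by maximizing Youden's J, neither of which is stated in the abstract. The function names, the `toy_score` stand-in model, and the synthetic data are all hypothetical, and no TabNet, XGBoost, or LightGBM model is involved.

```python
import random
import statistics


def add_gaussian_noise(rows, intensity, rng):
    """Return a copy of rows with zero-mean Gaussian noise added to every feature.

    Assumed scaling: the noise standard deviation for a column is
    intensity * (that column's population standard deviation).
    """
    columns = list(zip(*rows))
    col_std = [statistics.pstdev(col) for col in columns]
    return [
        [value + rng.gauss(0.0, intensity * std) for value, std in zip(row, col_std)]
        for row in rows
    ]


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when predicting class 1 for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def pick_threshold(scores, labels):
    """Pick the observed score that maximizes Youden's J = sens + spec - 1.

    Assumption: this is one common way to adapt a threshold; the abstract
    does not say which rule the paper uses.
    """
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t


def toy_score(row):
    """Hypothetical stand-in for a trained model: the mean of the features."""
    return sum(row) / len(row)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic two-class data (not the WDBC dataset): class 1 has larger values.
    labels = [0] * 50 + [1] * 50
    rows = [[rng.gauss(2.0 * y, 1.0) for _ in range(5)] for y in labels]

    # Fix the threshold on clean data, then evaluate under increasing noise.
    threshold = pick_threshold([toy_score(r) for r in rows], labels)
    for intensity in (0.0, 0.5, 1.0, 2.0):
        noisy = add_gaussian_noise(rows, intensity, rng)
        scores = [toy_score(r) for r in noisy]
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        print(f"intensity={intensity:.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

In a real evaluation the clean-data threshold would be chosen on held-out validation folds rather than on the same samples used for reporting, in line with the nested cross-validation protocol the abstract describes.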
