Adversarial Normalization: I Can Visualize Everything (ICE)
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Choi, Hoyoung | - |
dc.contributor.author | Jin, Seungwan | - |
dc.contributor.author | Han, Kyungsik | - |
dc.date.accessioned | 2023-09-11T01:55:06Z | - |
dc.date.available | 2023-09-11T01:55:06Z | - |
dc.date.created | 2023-07-20 | - |
dc.date.issued | 2023-06 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/190404 | - |
dc.description.abstract | Vision transformers use [CLS] tokens to predict image classes. Their explainability visualization has been studied using relevant information from [CLS] tokens or by focusing on attention scores during self-attention. Such visualization, however, is challenging because of the dependence of the structure of a vision transformer on skip connections and attention operators, the instability of nonlinearities in the learning process, and the limited reflection of self-attention scores on relevance. We argue that the output vectors for each input patch token in a vision transformer retain the image information of each patch location, which can facilitate the prediction of an image class. In this paper, we propose ICE (Adversarial Normalization: I Can Visualize Everything), a novel method that enables a model to directly predict a class for each patch in an image, thus advancing the effective visualization of the explainability of a vision transformer. Our method distinguishes background from foreground regions by predicting background classes for patches that do not determine image classes. We used the DeiT-S model, the most representative model employed in studies on the explainability visualization of vision transformers. On the ImageNet-Segmentation dataset, ICE outperformed all explainability visualization methods for four cases depending on the model size. We also conducted quantitative and qualitative analyses on the tasks of weakly-supervised object localization and unsupervised object discovery. On the CUB-200-2011 and PASCAL VOC07/12 datasets, ICE achieved performance comparable to that of the state-of-the-art methods. We incorporated ICE into the encoder of DeiT-S and improved efficiency by 44.01% on the ImageNet dataset over that achieved by the original DeiT-S model. We showed accuracy and efficiency comparable to EViT, the state-of-the-art pruning model, demonstrating the effectiveness of ICE. The code is available at https://github.com/Hanyang-HCC-Lab/ICE. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | IEEE | - |
dc.title | Adversarial Normalization: I Can Visualize Everything (ICE) | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Han, Kyungsik | - |
dc.identifier.bibliographicCitation | Conference on Computer Vision and Pattern Recognition, pp.1 - 10 | - |
dc.relation.isPartOf | Conference on Computer Vision and Pattern Recognition | - |
dc.citation.title | Conference on Computer Vision and Pattern Recognition | - |
dc.citation.startPage | 1 | - |
dc.citation.endPage | 10 | - |
dc.type.rims | ART | - |
dc.type.docType | Proceeding | - |
dc.description.journalClass | 3 | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | other | - |
dc.identifier.url | https://openaccess.thecvf.com/content/CVPR2023/papers/Choi_Adversarial_Normalization_I_Can_Visualize_Everything_ICE_CVPR_2023_paper.pdf | - |
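The abstract's central idea — that each patch token's output vector still carries the image content of its patch location, so a head can assign every patch either a real class or an extra "background" class — can be illustrated with a minimal sketch. This is an illustrative assumption-laden toy in plain PyTorch, not the authors' implementation (see the linked GitHub repository for that); the dimensions below merely mimic DeiT-S (196 patches, 384-dim embeddings).

```python
import torch

# Toy sketch of per-patch classification with a background class, as
# described in the abstract. NOT the authors' code; the encoder is
# replaced by random patch-token outputs for illustration only.
num_patches, embed_dim, num_classes = 196, 384, 1000  # DeiT-S-like sizes
background_class = num_classes  # index C is reserved for "background"

# Stand-in for the encoder's per-patch output vectors (batch of 1).
patch_tokens = torch.randn(1, num_patches, embed_dim)

# Shared head over C real classes plus 1 background class.
head = torch.nn.Linear(embed_dim, num_classes + 1)

logits = head(patch_tokens)                       # (1, 196, 1001)
per_patch_pred = logits.argmax(dim=-1)            # one class index per patch
foreground = per_patch_pred != background_class   # boolean foreground mask

print(logits.shape, foreground.shape)
```

Patches predicted as `background_class` are treated as background, and the remaining mask marks foreground regions, which is how the method separates the two for explainability visualization.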