Adversarial Normalization: I Can visualize Everything (ICE)
Open access

Authors
Choi, Hoyoung; Jin, Seungwan; Han, Kyungsik
Issue Date
Jun-2023
Publisher
IEEE
Citation
Conference on Computer Vision and Pattern Recognition, pp. 1-10
Indexed
OTHER
Journal Title
Conference on Computer Vision and Pattern Recognition
Start Page
1
End Page
10
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/190404
Abstract
Vision transformers use [CLS] tokens to predict image classes. Their explainability visualization has been studied using relevant information from [CLS] tokens or by focusing on attention scores during self-attention. Such visualization, however, is challenging because of the dependence of the structure of a vision transformer on skip connections and attention operators, the instability of nonlinearities in the learning process, and the limited reflection of self-attention scores on relevance. We argue that the output vectors for each input patch token in a vision transformer retain the image information of each patch location, which can facilitate the prediction of an image class. In this paper, we propose ICE (Adversarial Normalization: I Can visualize Everything), a novel method that enables a model to directly predict a class for each patch in an image, thus advancing the effective visualization of the explainability of a vision transformer. Our method distinguishes background from foreground regions by predicting background classes for patches that do not determine image classes. We used the DeiT-S model, the most representative model employed in studies on the explainability visualization of vision transformers. On the ImageNet-Segmentation dataset, ICE outperformed all explainability visualization methods for four cases depending on the model size. We also conducted quantitative and qualitative analyses on the tasks of weakly supervised object localization and unsupervised object discovery. On the CUB-200-2011 and PASCAL VOC07/12 datasets, ICE achieved performance comparable to the state-of-the-art methods. We incorporated ICE into the encoder of DeiT-S and improved efficiency by 44.01% on the ImageNet dataset over that achieved by the original DeiT-S model. We showed accuracy and efficiency comparable to EViT, the state-of-the-art pruning model, demonstrating the effectiveness of ICE. The code is available at https://github.com/Hanyang-HCC-Lab/ICE.
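The abstract's central idea, predicting a class for every patch and treating patches assigned a background class as background, can be illustrated with a minimal sketch. This is not the authors' implementation (the repository linked above is the authoritative source) and it does not model the adversarial normalization step. The function name `foreground_mask`, the convention that one extra class index stands for background, and the toy logits are all assumptions made for illustration.

```python
from typing import List


def foreground_mask(
    patch_logits: List[List[float]],
    background_index: int,
    grid_width: int,
) -> List[List[int]]:
    """Build a 2D 0/1 mask from per-patch class scores.

    A patch is marked 1 (foreground) when its highest-scoring class is not
    the hypothetical background class, and 0 otherwise.
    """
    if len(patch_logits) % grid_width != 0:
        raise ValueError("number of patches must be a multiple of grid_width")

    flat = []
    for logits in patch_logits:
        # Index of the top-scoring class for this patch.
        predicted = max(range(len(logits)), key=lambda c: logits[c])
        flat.append(0 if predicted == background_index else 1)

    # Reshape the flat patch sequence into rows of the patch grid.
    return [flat[i:i + grid_width] for i in range(0, len(flat), grid_width)]


# Toy input: a 2x2 patch grid, two object classes plus background (index 2).
logits = [
    [2.0, 0.1, 0.5],  # top class 0 -> foreground
    [0.2, 0.1, 3.0],  # top class 2 -> background
    [0.1, 0.3, 1.5],  # top class 2 -> background
    [0.4, 2.2, 0.3],  # top class 1 -> foreground
]
mask = foreground_mask(logits, background_index=2, grid_width=2)
print(mask)
```

In a real vision transformer the per-patch scores would come from a classification head applied to each patch token's output vector; here they are hard-coded so the sketch runs without any model.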

Related Researcher

Han, Kyungsik
COLLEGE OF ENGINEERING (DEPARTMENT OF INTELLIGENCE COMPUTING)