Detailed Information


Adversarial Normalization: I Can visualize Everything (ICE)

Full metadata record
dc.contributor.author: Choi, Hoyoung
dc.contributor.author: Jin, Seungwan
dc.contributor.author: Han, Kyungsik
dc.date.accessioned: 2023-11-14T08:17:53Z
dc.date.available: 2023-11-14T08:17:53Z
dc.date.issued: 2023-06
dc.identifier.issn: 1063-6919
dc.identifier.uri: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192191
dc.description.abstract: Vision transformers use [CLS] tokens to predict image classes. Their explainability visualization has been studied using relevant information from [CLS] tokens or by focusing on attention scores during self-attention. Such visualization, however, is challenging because of the dependence of the structure of a vision transformer on skip connections and attention operators, the instability of non-linearities in the learning process, and the limited reflection of self-attention scores on relevance. We argue that the output vectors for each input patch token in a vision transformer retain the image information of each patch location, which can facilitate the prediction of an image class. In this paper, we propose ICE (Adversarial Normalization: I Can visualize Everything), a novel method that enables a model to directly predict a class for each patch in an image, thus advancing the effective visualization of the explainability of a vision transformer. Our method distinguishes background from foreground regions by predicting background classes for patches that do not determine image classes. We used the DeiT-S model, the model most commonly employed in studies on the explainability visualization of vision transformers. On the ImageNet-Segmentation dataset, ICE outperformed all explainability visualization methods across four cases depending on the model size. We also conducted quantitative and qualitative analyses on the tasks of weakly-supervised object localization and unsupervised object discovery. On the CUB-200-2011 and PASCAL VOC 07/12 datasets, ICE achieved performance comparable to the state-of-the-art methods. We incorporated ICE into the encoder of DeiT-S and improved efficiency on the ImageNet dataset by 44.01% over the original DeiT-S model. We showed accuracy and efficiency comparable to EViT, the state-of-the-art pruning model, demonstrating the effectiveness of ICE. The code is available at https://github.com/Hanyang-HCC-Lab/ICE.
dc.format.extent: 10
dc.language: English
dc.language.iso: ENG
dc.publisher: IEEE Computer Society
dc.title: Adversarial Normalization: I Can visualize Everything (ICE)
dc.type: Article
dc.publisher.location: United States
dc.identifier.doi: 10.1109/CVPR52729.2023.01166
dc.identifier.scopusid: 2-s2.0-85173910149
dc.identifier.wosid: 001062522104042
dc.identifier.bibliographicCitation: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, v.2023-June, pp. 12115-12124
dc.citation.title: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
dc.citation.volume: 2023-June
dc.citation.startPage: 12115
dc.citation.endPage: 12124
dc.type.docType: Proceedings Paper
dc.description.isOpenAccess: N
dc.description.journalRegisteredClass: scie
dc.description.journalRegisteredClass: scopus
dc.relation.journalResearchArea: Computer Science
dc.relation.journalWebOfScienceCategory: Computer Science, Artificial Intelligence
dc.subject.keywordAuthor: Explainable computer vision
dc.identifier.url: https://ieeexplore.ieee.org/document/10203641
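
The abstract above describes the core idea of ICE: the output vector of every patch token carries patch-level image information, so a model can predict a class per patch and assign a background class to patches that do not determine the image class, which yields a foreground/background separation for visualization. The sketch below is only a rough illustration of that per-patch prediction idea, not the authors' implementation (their code is at the GitHub link in the abstract). It assumes timm's pretrained DeiT-S, whose classification head was trained only on the [CLS] token, and it substitutes a simple confidence threshold for ICE's learned background class; the function name and threshold value are made up for illustration.

```python
import torch
import timm

# Assumption: timm's DeiT-S, where forward_features returns the full token
# sequence (B, 1 + N, D) and model.head is the classification head.
model = timm.create_model("deit_small_patch16_224", pretrained=True).eval()


@torch.no_grad()
def per_patch_class_map(images, background_threshold=0.5):
    """Illustrative per-patch prediction: apply the classification head to
    every patch token (not just [CLS]) and treat low-confidence patches as
    background. ICE itself trains the model to emit an explicit background
    class for such patches; the thresholding here is only a stand-in."""
    tokens = model.forward_features(images)       # (B, 1 + N, D)
    patch_tokens = tokens[:, 1:, :]               # drop the [CLS] token
    logits = model.head(patch_tokens)             # (B, N, num_classes)
    probs = logits.softmax(dim=-1)
    conf, pred = probs.max(dim=-1)                # per-patch confidence / class
    foreground = conf > background_threshold      # crude background split
    side = int(patch_tokens.shape[1] ** 0.5)      # 14 x 14 patches for 224 px input
    return pred.view(-1, side, side), foreground.view(-1, side, side)


# Example: a 14 x 14 per-patch class map and foreground mask for one image.
class_map, fg_mask = per_patch_class_map(torch.randn(1, 3, 224, 224))
print(class_map.shape, fg_mask.float().mean().item())
```

Visualizing `fg_mask` over the input grid gives the kind of patch-level explanation map the abstract refers to; in ICE the background decision is learned adversarially during training rather than thresholded after the fact.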
Appears in Collections: ETC > 1. Journal Articles



Related Researcher
Han, Kyungsik
COLLEGE OF ENGINEERING (DEPARTMENT OF INTELLIGENCE COMPUTING)
