Generative Bias for Robust Visual Question Answering
- Authors
- Cho, Jae Won; Kim, Dong-Jin; Ryu, Hyeonggon; Kweon, In So
- Issue Date
- Aug-2023
- Publisher
- IEEE COMPUTER SOC
- Keywords
- language; reasoning; Vision
- Citation
- 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), pp 11681 - 11690
- Pages
- 10
- Indexed
- SCIE; SCOPUS
- Journal Title
- 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)
- Start Page
- 11681
- End Page
- 11690
- URI
- https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/190374
- DOI
- 10.1109/CVPR52729.2023.01124
- ISSN
- 1063-6919; 2575-7075
- Abstract
- The task of Visual Question Answering (VQA) is known to be plagued by the issue of VQA models exploiting biases within the dataset to make their final predictions. Various previous ensemble-based debiasing methods have been proposed in which an additional model is purposefully trained to be biased in order to train a robust target model. However, these methods compute the bias for a model simply from the label statistics of the training data or from single-modal branches. In this work, in order to better learn the bias a target VQA model suffers from, we propose a generative method to train the bias model directly from the target model, called GenB. In particular, GenB employs a generative network to learn the bias in the target model through a combination of the adversarial objective and knowledge distillation. We then debias our target model with GenB as a bias model, and show through extensive experiments the effects of our method on various VQA bias datasets including VQA-CP2, VQA-CP1, GQA-OOD, and VQA-CE, and show state-of-the-art results with the LXMERT architecture on VQA-CP2.
- Appears in Collections
- College of Engineering (Seoul) > ETC > 1. Journal Articles
