Detailed Information

Single-Modal Entropy based Active Learning for Visual Question Answering

Authors
Kim, Dong Jin; Cho, Jae Won; Choi, Jinsoo; Jung, Yunjae; Kweon, In So
Issue Date
Nov-2021
Publisher
British Machine Vision Association (BMVA)
Citation
British Machine Vision Conference, pp. 1-15
Indexed
OTHER
Journal Title
British Machine Vision Conference
Start Page
1
End Page
15
URI
https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/192368
Abstract
Constructing a large-scale labeled dataset in the real world, especially for high-level tasks (e.g., Visual Question Answering), can be expensive and time-consuming. In addition, with the ever-growing amounts of data and architecture complexity, Active Learning has become an important aspect of computer vision research. In this work, we address Active Learning in the multi-modal setting of Visual Question Answering (VQA). In light of the multi-modal inputs, image and question, we propose a novel method for effective sample acquisition through the use of ad hoc single-modal branches for each input to leverage its information. Our mutual-information-based sample acquisition strategy, Single-Modal Entropic Measure (SMEM), together with our self-distillation technique, enables the sample acquisitor to exploit all present modalities and find the most informative samples. Our novel idea is simple to implement, cost-efficient, and readily adaptable to other multi-modal tasks. We confirm our findings on various VQA datasets, achieving state-of-the-art performance in comparison with existing Active Learning baselines.
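The acquisition idea in the abstract can be illustrated with a minimal sketch. It assumes each unlabeled sample already has answer distributions from an image-only branch and a question-only branch, and it ranks samples by the sum of the two Shannon entropies. This scoring rule is a simplified stand-in chosen for illustration, not the SMEM formula from the paper, and the names `entropy`, `single_modal_score`, and `select_for_labeling` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# One pool entry: (image-only answer distribution, question-only answer distribution).
BranchOutputs = Tuple[Sequence[float], Sequence[float]]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def single_modal_score(p_image: Sequence[float], p_question: Sequence[float]) -> float:
    """Assumed score: sum of the two single-modal entropies (not the paper's SMEM)."""
    return entropy(p_image) + entropy(p_question)


def select_for_labeling(pool: List[BranchOutputs], budget: int) -> List[int]:
    """Return indices of the `budget` highest-scoring samples, most uncertain first."""
    scores = [single_modal_score(p_img, p_q) for p_img, p_q in pool]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy pool of three samples with two candidate answers each.
pool = [
    ([0.9, 0.1], [0.8, 0.2]),  # fairly confident in both branches
    ([0.5, 0.5], [0.5, 0.5]),  # maximally uncertain in both branches
    ([1.0, 0.0], [0.6, 0.4]),  # image branch certain, question branch unsure
]
chosen = select_for_labeling(pool, budget=2)
```

In an Active Learning loop, the selected indices would be sent for annotation and added to the labeled set before the model is retrained.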
Appears in
Collections
ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Dong Jin
COLLEGE OF ENGINEERING (DEPARTMENT OF INTELLIGENCE COMPUTING)
