Dense but Efficient VideoQA for Intricate Compositional Reasoning
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Jihyeon | - |
dc.contributor.author | Kang, Wooyoung | - |
dc.contributor.author | Kim, Eun Sol | - |
dc.date.accessioned | 2023-03-13T07:21:15Z | - |
dc.date.available | 2023-03-13T07:21:15Z | - |
dc.date.created | 2023-03-08 | - |
dc.date.issued | 2023-01 | - |
dc.identifier.issn | 0000-0000 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/182539 | - |
dc.description.abstract | Most conventional video question answering (VideoQA) datasets consist of easy questions that require only simple reasoning. Long videos, however, inevitably contain complex and compositional semantic structures along the spatio-temporal axis, which a model must understand in order to answer questions about them. In this paper, we propose a new compositional VideoQA method based on a transformer architecture with a deformable attention mechanism to address such complex VideoQA tasks. Deformable attention is introduced to sample a subset of informative visual features from the dense visual feature map, efficiently covering a temporally long range of frames (see the illustrative sketch after this record). Furthermore, the dependency structure of the complex question sentences is combined with the language embeddings so that the relations among question words are readily captured. Extensive experiments and ablation studies show that the proposed dense but efficient model outperforms other baselines. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | Dense but Efficient VideoQA for Intricate Compositional Reasoning | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Kim, Eun Sol | - |
dc.identifier.doi | 10.1109/WACV56688.2023.00117 | - |
dc.identifier.scopusid | 2-s2.0-85149030587 | - |
dc.identifier.wosid | 000971500201020 | - |
dc.identifier.bibliographicCitation | Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023, pp.1114 - 1123 | - |
dc.relation.isPartOf | Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 | - |
dc.citation.title | Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 | - |
dc.citation.startPage | 1114 | - |
dc.citation.endPage | 1123 | - |
dc.type.rims | ART | - |
dc.type.docType | Proceedings Paper | - |
dc.description.journalClass | 1 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Imaging Science & Photographic Technology | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Imaging Science & Photographic Technology | - |
dc.subject.keywordPlus | Computer vision | - |
dc.subject.keywordPlus | Action recognition | - |
dc.subject.keywordPlus | Algorithm: video recognition and understanding (tracking, action recognition, etc.) | - |
dc.subject.keywordPlus | Compositional reasoning | - |
dc.subject.keywordPlus | Question Answering | - |
dc.subject.keywordPlus | Simple++ | - |
dc.subject.keywordPlus | Video recognition | - |
dc.subject.keywordPlus | Video understanding | - |
dc.subject.keywordPlus | Vision + language and/or other modality | - |
dc.subject.keywordPlus | Visual feature | - |
dc.subject.keywordPlus | Semantics | - |
dc.subject.keywordAuthor | Algorithms: Video recognition and understanding (tracking, action recognition, etc.) | - |
dc.subject.keywordAuthor | Vision + language and/or other modalities | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/10030999 | - |
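
Neither the paper's code nor its exact formulation is included in this record, so the following is a minimal, hypothetical sketch of the temporal deformable-attention sampling idea the abstract describes: each query predicts a handful of temporal offsets and mixing weights, and only those few interpolated frame features are aggregated instead of attending densely over all frames. All names here (`TemporalDeformableAttention`, `n_points`, `ref_points`) are illustrative assumptions, not the authors' API; the structure follows the 1-D analogue of Deformable DETR, not necessarily this paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalDeformableAttention(nn.Module):
    """Each query predicts a few temporal sampling offsets plus mixing
    weights, then aggregates linearly interpolated frame features, so
    attention touches only n_points frames rather than all T frames."""

    def __init__(self, dim: int, n_points: int = 4):
        super().__init__()
        self.n_points = n_points
        self.offset_head = nn.Linear(dim, n_points)  # where to sample (in frames)
        self.weight_head = nn.Linear(dim, n_points)  # how much each sample counts
        self.value_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, queries, frame_feats, ref_points):
        # queries:     (B, Q, D) question-conditioned query tokens
        # frame_feats: (B, T, D) dense per-frame visual features
        # ref_points:  (B, Q)    reference locations in [0, 1] along time
        B, T, D = frame_feats.shape
        values = self.value_proj(frame_feats)                    # (B, T, D)
        offsets = self.offset_head(queries)                      # (B, Q, P)
        weights = F.softmax(self.weight_head(queries), dim=-1)   # (B, Q, P)

        # Normalized sampling locations, clamped inside the clip.
        loc = (ref_points.unsqueeze(-1) + offsets / T).clamp(0.0, 1.0)
        pos = loc * (T - 1)                                      # (B, Q, P)
        lo = pos.floor().long().clamp(0, T - 1)
        hi = (lo + 1).clamp(0, T - 1)
        frac = (pos - lo.float()).unsqueeze(-1)                  # (B, Q, P, 1)

        def gather(idx):
            # Pick values[b, idx[b, q, p], :] for every (b, q, p).
            expanded = values.unsqueeze(1).expand(B, idx.size(1), T, D)
            return torch.gather(expanded, 2,
                                idx.unsqueeze(-1).expand(-1, -1, -1, D))

        # Linear interpolation between the two nearest frames (the 1-D
        # analogue of bilinear sampling in deformable attention).
        sampled = (1.0 - frac) * gather(lo) + frac * gather(hi)  # (B, Q, P, D)
        out = (weights.unsqueeze(-1) * sampled).sum(dim=2)       # (B, Q, D)
        return self.out_proj(out)


# Tiny smoke test with random tensors.
if __name__ == "__main__":
    B, T, Q, D = 2, 64, 8, 256
    attn = TemporalDeformableAttention(dim=D, n_points=4)
    out = attn(torch.randn(B, Q, D), torch.randn(B, T, D), torch.rand(B, Q))
    print(out.shape)  # torch.Size([2, 8, 256])
```

The point of this sampling scheme is the cost model: per query, aggregation drops from O(T) for dense attention to O(n_points), which is what lets a dense frame-level feature map cover a temporally long clip efficiently. The dependency-parse side of the model mentioned in the abstract is a separate language component and is not sketched here.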