Dense but Efficient VideoQA for Intricate Compositional Reasoning

Full metadata record
DC Field | Value | Language
dc.contributor.author | Lee, Jihyeon | -
dc.contributor.author | Kang, Wooyoung | -
dc.contributor.author | Kim, Eun Sol | -
dc.date.accessioned | 2023-03-13T07:21:15Z | -
dc.date.available | 2023-03-13T07:21:15Z | -
dc.date.created | 2023-03-08 | -
dc.date.issued | 2023-01 | -
dc.identifier.issn | 0000-0000 | -
dc.identifier.uri | https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/182539 | -
dc.description.abstract | It is well known that most of the conventional video question answering (VideoQA) datasets consist of easy questions requiring simple reasoning processes. However, long videos inevitably contain complex and compositional semantic structures along with the spatio-temporal axis, which requires a model to understand the compositional structures inherent in the videos. In this paper, we suggest a new compositional VideoQA method based on transformer architecture with a deformable attention mechanism to address the complex VideoQA tasks. The deformable attentions are introduced to sample a subset of informative visual features from the dense visual feature map to cover a temporally long range of frames efficiently. Furthermore, the dependency structure within the complex question sentences is also combined with the language embeddings to readily understand the relations among question words. Extensive experiments and ablation studies show that the suggested dense but efficient model outperforms other baselines. | -
dc.language | English | -
dc.language.iso | en | -
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | -
dc.title | Dense but Efficient VideoQA for Intricate Compositional Reasoning | -
dc.type | Article | -
dc.contributor.affiliatedAuthor | Kim, Eun Sol | -
dc.identifier.doi | 10.1109/WACV56688.2023.00117 | -
dc.identifier.scopusid | 2-s2.0-85149030587 | -
dc.identifier.wosid | 000971500201020 | -
dc.identifier.bibliographicCitation | Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023, pp.1114 - 1123 | -
dc.relation.isPartOf | Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 | -
dc.citation.title | Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023 | -
dc.citation.startPage | 1114 | -
dc.citation.endPage | 1123 | -
dc.type.rims | ART | -
dc.type.docType | Proceedings Paper | -
dc.description.journalClass | 1 | -
dc.description.isOpenAccess | N | -
dc.description.journalRegisteredClass | scopus | -
dc.relation.journalResearchArea | Computer Science | -
dc.relation.journalResearchArea | Engineering | -
dc.relation.journalResearchArea | Imaging Science & Photographic Technology | -
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | -
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | -
dc.relation.journalWebOfScienceCategory | Imaging Science & Photographic Technology | -
dc.subject.keywordPlus | Computer vision | -
dc.subject.keywordPlus | Action recognition | -
dc.subject.keywordPlus | Algorithm: video recognition and understanding (tracking, action recognition, etc.) | -
dc.subject.keywordPlus | Compositional reasoning | -
dc.subject.keywordPlus | Question Answering | -
dc.subject.keywordPlus | Simple++ | -
dc.subject.keywordPlus | Video recognition | -
dc.subject.keywordPlus | Video understanding | -
dc.subject.keywordPlus | Vision + language and/or other modality | -
dc.subject.keywordPlus | Visual feature | -
dc.subject.keywordPlus | Semantics | -
dc.subject.keywordAuthor | Algorithms: Video recognition and understanding (tracking, action recognition, etc.) | -
dc.subject.keywordAuthor | Vision + language and/or other modalities | -
dc.identifier.url | https://ieeexplore.ieee.org/document/10030999 | -
Appears in Collections
College of Engineering (Seoul) > School of Computer Software (Seoul) > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Eun Sol
COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)