CE-BART: Cause-and-Effect BART for Visual Commonsense Generation
Open access
- Authors
- Kim, Junyeong; Hong, Ji Woo; Yoon, Sunjae; Yoo, Chang D.
- Issue Date
- Dec-2022
- Publisher
- MDPI
- Keywords
- deep learning; visual-language reasoning; visual commonsense generation; video-grounded dialogue; VisualCOMET; AVSD
- Citation
- SENSORS, v.22, no.23
- Journal Title
- SENSORS
- Volume
- 22
- Number
- 23
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/61132
- DOI
- 10.3390/s22239399
- ISSN
- 1424-8220
- 1424-3210
- Abstract
- "A picture is worth a thousand words." Given an image, humans can deduce various cause-and-effect captions of past, current, and future events beyond the image. The task of visual commonsense generation aims to generate three cause-and-effect captions for a given image: (1) what needed to happen before, (2) what the current intent is, and (3) what will happen after. However, this task remains challenging for machines owing to two limitations of existing approaches: they (1) directly use conventional vision-language transformers to learn relationships between input modalities and (2) ignore relations among the target cause-and-effect captions, considering each caption independently. Herein, we propose Cause-and-Effect BART (CE-BART), which is based on (1) a structured graph reasoner that captures intra- and inter-modality relationships among visual and textual representations and (2) a cause-and-effect generator that generates cause-and-effect captions by considering the causal relations among inferences. We demonstrate the validity of CE-BART on the VisualCOMET and AVSD benchmarks. CE-BART achieved state-of-the-art (SOTA) performance on both benchmarks, and an extensive ablation study and qualitative analysis demonstrated the performance gain and improved interpretability.
- Appears in Collections
- College of Software > Department of Artificial Intelligence > 1. Journal Articles