Detailed Information


Multimedia analysis of robustly optimized multimodal transformer based on vision and language co-learning

Authors
Yoon, JunHo; Choi, GyuHo; Choi, Chang
Issue Date
Dec-2023
Publisher
ELSEVIER
Keywords
Multi-modal; Multimedia; Natural disasters; Classification
Citation
INFORMATION FUSION, v.100
Journal Title
INFORMATION FUSION
Volume
100
URI
https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/89120
DOI
10.1016/j.inffus.2023.101922
ISSN
1566-2535
Abstract
Recently, multimodal learning that uses information from all modalities has been studied as a way to detect disinformation in multimedia. Existing multimodal learning methods include score-level fusion approaches, which combine different models, and feature-level fusion methods, which combine embedding vectors to integrate data of different dimensions. Because a late-level fusion method combines the modalities only after each has been processed individually, its overall performance is limited by the recognition performance of the unimodal models. In addition, fusion methods are constrained in that the data must be matched across the modalities. In this study, we propose a classification system using a RoBERTa-based multimodal fusion transformer (RoBERTaMFT) that applies a co-learning method to address both the limited recognition performance of multimodal learning and the data imbalance among the modalities. RoBERTaMFT consists of image feature extraction, co-learning using the reconstruction of image features with text embedding, and a late-level fusion step applied to the final classification. In experiments on the CrisisMMD dataset, RoBERTaMFT achieved an accuracy 21.2% higher and an F1-score 0.414 higher than those of unimodal learning, and an accuracy 11.7% higher and an F1-score 0.268 higher than those of existing multimodal learning.
Appears in
Collections
College of IT Convergence > Department of Computer Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Choi, Chang
College of IT Convergence (School of Computer Engineering (Computer Engineering Major))