Triple-modality interaction for deepfake detection on zero-shot identity

Yoon, Junho; Panizo-LLedot, Angel; Camacho, David; Choi, Chang

Issue Date: Sep-2024

Publisher: ELSEVIER

Keywords: Multi-modal; One-shot; Deepfake; Disinformation detection

Citation: INFORMATION FUSION, v.109

Journal Title: INFORMATION FUSION

Volume: 109

URI: https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/91751

DOI: 10.1016/j.inffus.2024.102424

ISSN: 1566-2535; 1872-6305

Abstract: Recent advancements in generative AI technology have created more realistic fake data that are utilized in various fields, such as data augmentation. However, the misuse of deepfake technology has led to increased damage. Consequently, ongoing research aims to analyze modality characteristics and detect deepfakes through AI-based methods. Existing AI-based deepfake-detection techniques have limitations in detecting deepfakes in modalities and identities that are not included in the training data. This study proposes a baseline approach based on zero-shot identity and one-shot deepfake detection for detecting deepfakes in environments with limited data. Additionally, we propose a triple-modality interaction based on a multimodal transformer (TMI-Former) to consider the triple-modality aspects of deepfakes. TMI-Former comprises four stages: vision feature extraction, representation, residual connection, and late-level fusion. It operates in a two-stage manner, extracting visual features and reconstructing them using auditory and linguistic features, thereby allowing for triple-modality interactions. In environments with limited data, such as zero-shot identity and one-shot deepfake scenarios, TMI-Former demonstrated effectiveness, with an accuracy ranging from 18.75% to 19.5% and an F1-score ranging from 0.2238 to 0.3561, compared to unimodal AI. Furthermore, TMI-Former shows superior performance compared to the existing multi-modal AI, with an accuracy ranging from 1.44% to 19.75% and an F1-score ranging from 0.0146 to 0.4169.
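
Illustrative sketch (not part of the record): the abstract names four stages (vision feature extraction, representation, residual connection, late-level fusion) but does not describe their internals. The toy Python below shows one way such a pipeline could be wired, under stated assumptions: vision tokens are taken as already extracted, the "representation" step is modeled as plain scaled dot-product cross-attention from vision tokens to audio or text tokens, and late fusion is modeled as averaging two per-branch probabilities. All function names, the shared linear head, and the feature values are hypothetical; this is not the authors' TMI-Former implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention: each query row attends over the context rows.

    Assumption: keys and values are the raw context rows (no learned projections).
    """
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in context]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, context))
                    for d in range(len(q))])
    return out


def add_residual(x, y):
    """Element-wise sum of two equally shaped token matrices (residual connection)."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def mean_pool(x):
    """Average the token rows into a single feature vector."""
    n = len(x)
    return [sum(col) / n for col in zip(*x)]


def fake_probability(pooled, head_weights, bias=0.0):
    """Hypothetical linear head followed by a sigmoid."""
    z = sum(p * w for p, w in zip(pooled, head_weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def tri_modal_score(vision, audio, text, head_weights):
    """Re-express vision tokens through audio and text, then fuse late."""
    # Representation + residual: vision tokens rebuilt from each other modality.
    vision_audio = add_residual(vision, cross_attention(vision, audio))
    vision_text = add_residual(vision, cross_attention(vision, text))
    # Late-level fusion (assumed here to be a simple average of branch outputs).
    p_audio = fake_probability(mean_pool(vision_audio), head_weights)
    p_text = fake_probability(mean_pool(vision_text), head_weights)
    return (p_audio + p_text) / 2.0


# Toy inputs: 2 vision tokens, 1 audio token, 2 text tokens, feature size 3.
vision = [[0.2, 0.1, 0.4], [0.3, 0.0, 0.1]]
audio = [[0.5, 0.2, 0.1]]
text = [[0.1, 0.9, 0.3], [0.4, 0.4, 0.4]]
head = [0.5, -0.25, 0.1]  # untrained, made-up weights

score = tri_modal_score(vision, audio, text, head)
print(f"toy fake-probability: {score:.4f}")
```

The weights here are untrained, so the printed value carries no meaning; the sketch only shows how visual features can be reconstructed from the other two modalities before a late decision is made.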

Appears in Collections: ETC > 1. Journal Articles

Related Researcher

Choi, Chang: College of IT Convergence (School of Computer Engineering, Computer Engineering Major)
