Tutorial @ ICME 2024, July 15, 2024
You (Neil) Zhang, University of Rochester
Luchuan Song, University of Rochester
Menglu Li, Toronto Metropolitan University
Zhiyao Duan, University of Rochester
Chenliang Xu, University of Rochester
Xiao-Ping Zhang, Toronto Metropolitan University
The rapid advancement of Deepfake technology, driven in particular by large language models and 3D generation techniques, is raising significant concerns across various sectors because of its potential misuse in creating deceptive audio-visual content. This potential for misuse has prompted an urgent need for effective detection methodologies. Our tutorial, "Multimedia Deepfake Detection," aims to address this challenge by convening experts from different but related research communities that focus on the current challenges posed by Deepfakes. The primary objective is to foster cross-disciplinary collaboration, exchanging insights and methodologies to enhance the effectiveness of audio-visual Deepfake detection techniques.
Li, Menglu, Yasaman Ahmadiadli, and Xiao-Ping Zhang. "Audio Anti-Spoofing Detection: A Survey." arXiv 2024.
Khalid, Hasam, et al. "FakeAVCeleb: A novel audio-video multimodal deepfake dataset." NeurIPS Datasets Track 2021.
Korshunov, Pavel, et al. "Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes." IJCB 2023.
Hou, Yang, et al. "PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset." arXiv 2024.
Cai, Zhixi, et al. "AV-Deepfake1M: A large-scale LLM-driven audio-visual deepfake dataset." arXiv 2023.
Mittal, Trisha, et al. "Video manipulations beyond faces: A dataset with human-machine analysis." WACV 2023.
Zhou, Yipin, and Ser-Nam Lim. "Joint audio-visual deepfake detection." ICCV 2021.
Yang, Wenyuan, et al. "AVoiD-DF: Audio-visual joint learning for detecting deepfake." TIFS 2023.
Zou, Heqing, et al. "Cross-Modality and Within-Modality Regularization for Audio-Visual Deepfake Detection." ICASSP 2024.
Zhang, Yibo, Weiguo Lin, and Junfeng Xu. "Joint audio-visual attention with contrastive learning for more general deepfake detection." TOMM 2024.
Yu, Cai, et al. "Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection." ICME 2024.
Oorloff, Trevine, et al. "AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection." CVPR 2024.
Mittal, Trisha, et al. "Emotions don't lie: An audio-visual deepfake detection method using affective cues." ACM MM 2020.
Cozzolino, Davide, et al. "Audio-visual person-of-interest deepfake detection." CVPRW 2023.
Cheng, Harry, et al. "Voice-face homogeneity tells deepfake." TOMM 2023.
Feng, Chao, Ziyang Chen, and Andrew Owens. "Self-supervised video forensics by audio-visual anomaly detection." CVPR 2023.
Shahzad, Sahibzada Adil, et al. "AV-Lip-Sync+: Leveraging AV-HuBERT to exploit multimodal inconsistency for video deepfake detection." arXiv 2023.
Bohacek, Matyas, and Hany Farid. "Lost in Translation: Lip-Sync Deepfake Detection from Audio-Video Mismatch." CVPRW 2024.
Liu, Miao, et al. "Audio-visual temporal forgery detection using embedding-level fusion and multi-dimensional contrastive loss." TCSVT 2023.
Yin, Qilin, et al. "Fine-Grained Multimodal DeepFake Classification via Heterogeneous Graphs." IJCV 2024.
Nguyen, Tai D., Shengbang Fang, and Matthew C. Stamm. "VideoFACT: Detecting video forgeries using attention, scene context, and forensic traces." WACV 2024.