Masked Relation Learning for DeepFake Detection

IEEE Transactions on Information Forensics and Security (TIFS), 2023.02.14

Ziming Yang, Jian Liang, Yuting Xu, Xiao-Yu Zhang*, Ran He.

Abstract

DeepFake detection aims to differentiate falsified faces from real ones. Most approaches formulate it as a binary classification problem and mine only the local artifacts and inconsistencies of face forgery, neglecting the relations across local regions. Although several recent works explore local relation learning for DeepFake detection, they overlook the propagation of relational information, which leads to limited performance gains. To address these issues, this paper provides a new perspective by formulating DeepFake detection as a graph classification problem, in which each facial region corresponds to a vertex. However, highly redundant relational information hinders the expressiveness of graphs. Inspired by the success of masked modeling, we propose Masked Relation Learning, which decreases the redundancy to learn informative relational features. Specifically, a spatiotemporal attention module is exploited to learn the attention features of multiple facial regions. A relation learning module masks partial correlations between regions to reduce redundancy and then propagates the relational information across regions to capture the irregularity from a global view of the graph. We empirically discover that a moderate masking rate (e.g., 50%) brings the best performance gain. Experiments verify the effectiveness of Masked Relation Learning and demonstrate that our approach outperforms the state of the art by 2% AUC on cross-dataset DeepFake video detection. Code will be available at https://github.com/zimyang/MaskRelation.
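
The idea of masking relations and then propagating information across facial regions can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `masked_relation_propagation`, the use of cosine similarity as the relation score, uniform random masking of edges, and the single softmax-weighted propagation step with mean pooling are all assumptions made for illustration. The abstract specifies only that partial correlations are masked (for example at a 50% rate) before relational information is propagated over the graph.

```python
import numpy as np


def masked_relation_propagation(features, mask_rate=0.5, seed=0):
    """Toy sketch: mask region-to-region relations, then propagate once.

    features: (n_regions, dim) array, one row per facial region (vertex).
    Returns (weights, propagated, graph_embedding).
    Every modelling choice here is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]

    # Relation scores between regions: cosine similarity (assumed choice).
    norms = np.linalg.norm(features, axis=1, keepdims=True) + 1e-8
    normed = features / norms
    scores = normed @ normed.T

    # Pick a fraction of the undirected edges (upper triangle) to mask.
    rows, cols = np.triu_indices(n, k=1)
    n_edges = rows.size
    n_masked = int(mask_rate * n_edges)
    dropped = rng.choice(n_edges, size=n_masked, replace=False)

    # Build a symmetric keep-mask; self-loops are always kept.
    keep = np.ones((n, n), dtype=bool)
    keep[rows[dropped], cols[dropped]] = False
    keep[cols[dropped], rows[dropped]] = False

    # Masked relations get zero weight after a row-wise softmax.
    masked_scores = np.where(keep, scores, -np.inf)
    masked_scores = masked_scores - masked_scores.max(axis=1, keepdims=True)
    weights = np.exp(masked_scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # One propagation step over the masked graph, then a mean readout
    # that yields a single vector for graph-level classification.
    propagated = weights @ features
    graph_embedding = propagated.mean(axis=0)
    return weights, propagated, graph_embedding


if __name__ == "__main__":
    # Four hypothetical facial regions with 8-dimensional features.
    regions = np.random.default_rng(1).normal(size=(4, 8))
    w, h, g = masked_relation_propagation(regions, mask_rate=0.5)
    print(w.round(3))
    print(g.shape)
```

In a trained detector, the graph embedding would feed a real/fake classifier and the relation scores would come from learned attention features rather than raw cosine similarity; the sketch only shows how masking removes edges before propagation.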