CLC number: TP309
On-line Access: 2025-10-13
Received: 2024-11-17
Revision Accepted: 2025-02-13
Crosschecked: 2025-10-13
Hui SHI, Guibin WANG, Yanni LI, Rujia QI. Full-defense framework: multi-level deepfake detection and source tracing[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(9): 1649-1661.
@article{shi2025fulldefense,
title="Full-defense framework: multi-level deepfake detection and source tracing",
author="Hui SHI and Guibin WANG and Yanni LI and Rujia QI",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="26",
number="9",
pages="1649-1661",
year="2025",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2401012"
}
Abstract: Deepfakes pose significant threats to various fields, including politics, journalism, and entertainment. Although many defense methods against deepfakes have been proposed, they rely on either passive detection or proactive defense, and few achieve both. To address this issue, we propose a full-defense framework (FDF) based on cross-domain feature fusion and separable watermarks (SepMark) that combines the ideas of passive detection and proactive defense to achieve copyright protection and deepfake detection. The proactive defense module consists of one encoder and two separable decoders: the encoder embeds a single watermark into the protected face, and the two decoders separately extract it with different levels of robustness. The robust watermark reliably traces the trusted marked face, while the semi-robust watermark is sensitive to malicious distortions and disappears after deepfake manipulation or a watermark removal attack. The passive detection module fuses spatial- and frequency-domain features to further differentiate deepfake content from watermark removal attacks when no watermark is present. The proposed cross-domain feature fusion substitutes the “secondary” channels of the spatial-domain features with the “primary” channels of the frequency-domain features, and then uses the “primary” channels of the spatial-domain features to replace the “secondary” channels of the frequency-domain features. Extensive experiments demonstrate that our approach not only provides proactive defense through the extracted watermarks, i.e., source tracing and copyright protection, but also achieves passive detection in the absence of watermarks, further differentiating deepfake content from watermark removal attacks and thereby offering a full-defense solution.
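The cross-domain channel-swap fusion described in the abstract can be pictured with a short sketch. The code below is not the authors' implementation; it assumes PyTorch feature maps of shape (B, C, H, W) and, purely for illustration, ranks channels by mean activation magnitude to decide which are “primary” and which are “secondary”.

# Minimal sketch (assumptions noted above), not the authors' released code.
import torch

def channel_swap_fusion(spatial_feat: torch.Tensor,
                        freq_feat: torch.Tensor,
                        swap_ratio: float = 0.5):
    # Both inputs are assumed to be (B, C, H, W) feature maps with the same C.
    c = spatial_feat.shape[1]
    k = max(1, int(c * swap_ratio))  # number of primary/secondary channels (assumed split)

    # Assumed importance score: mean absolute activation per channel.
    spa_order = spatial_feat.abs().mean(dim=(0, 2, 3)).argsort(descending=True)
    frq_order = freq_feat.abs().mean(dim=(0, 2, 3)).argsort(descending=True)
    spa_primary, spa_secondary = spa_order[:k], spa_order[-k:]
    frq_primary, frq_secondary = frq_order[:k], frq_order[-k:]

    fused_spatial = spatial_feat.clone()
    fused_freq = freq_feat.clone()
    # "Secondary" spatial channels are replaced by "primary" frequency channels,
    # then "secondary" frequency channels by "primary" spatial channels.
    fused_spatial[:, spa_secondary] = freq_feat[:, frq_primary]
    fused_freq[:, frq_secondary] = spatial_feat[:, spa_primary]
    return fused_spatial, fused_freq

Under these assumptions, each domain keeps its most informative channels while inheriting the strongest channels of the other domain, which matches the intuition of the fusion step described in the abstract.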