Disclaimer: These are my personal notes on this paper. I am in no way related to this paper. All credit goes to the authors.
DeepFaceLab: A simple, flexible and extensible face swapping framework
May 20, 2020 -
Paper Link -
Tags: Deepfake, Framework
Summary
DeepFaceLab (DFL) is a framework for producing deepfakes. It generates high-quality face-swapped images and offers many customization options.
Notes
DeepFaceLab is an open source pipeline for generating deepfakes. The pipeline consists of a face detector module, a face recognition module, a face alignment module, a face parsing module, and a face blending module.
Leras ("Lighter Keras"): DeepFaceLab's networks are written in this custom library built directly on TensorFlow, since Keras added too much overhead in the authors' opinion.
Only requires a source and a destination folder to make deepfakes.
Pipeline consists of three phases: extraction, training, and conversion
Extraction
Face Detector: Uses S3FD as the default face detector
Face Alignment:
2DFAN, a heatmap-based facial landmark algorithm
PRNet, which incorporates 3D face prior information. There is an option to smooth the facial landmarks across consecutive frames, which should improve temporal stability of the detections
Uses the Umeyama method to compute a similarity transformation matrix for face alignment. DFL supports the front view as well as left/right side views, distinguished via the Euler angles of the obtained facial landmarks (see the sketch below)
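A minimal sketch of this alignment step, assuming `landmarks` are the detected 2D landmarks and `template` is a canonical set of mean-face landmark positions (both hypothetical inputs; DFL's actual implementation differs). skimage's `SimilarityTransform.estimate` solves the Umeyama least-squares problem internally:

```python
# Rough sketch of Umeyama-based face alignment (not DFL's actual code).
import numpy as np
from skimage import transform as trans

def align_face(image, landmarks, template, output_size=256):
    """Warp `image` so its detected landmarks match a canonical template."""
    tform = trans.SimilarityTransform()
    # Estimate the similarity transform (rotation + uniform scale + translation)
    # mapping detected landmarks -> template positions (Umeyama method).
    tform.estimate(np.asarray(landmarks), np.asarray(template))
    # Warp the source image into the aligned, fixed-size face crop.
    aligned = trans.warp(image, tform.inverse,
                         output_shape=(output_size, output_size))
    return aligned, tform.params  # aligned crop and the 3x3 similarity matrix
```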
Face Segmentation: Uses TernausNet to segment the face so that obstructions (such as hair) can be excluded from the face mask
Training
Figure 3 has a network overview.
Loss function: DSSIM (structural dissimilarity) + Mean Squared Error (MSE)
DSSIM helps the model generalize to human faces faster than MSE
MSE provides better image clarity
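A rough sketch of such a combined loss in TensorFlow (my own weights and naming, not the authors' exact formulation):

```python
# Hedged sketch of a DSSIM + MSE reconstruction loss; weights are illustrative.
import tensorflow as tf

def reconstruction_loss(y_true, y_pred, ssim_weight=1.0, mse_weight=1.0):
    # DSSIM = (1 - SSIM) / 2, computed per image over the batch.
    ssim = tf.image.ssim(y_true, y_pred, max_val=1.0)
    dssim = (1.0 - ssim) / 2.0
    # Plain per-pixel mean squared error, averaged per image.
    mse = tf.reduce_mean(tf.square(y_true - y_pred), axis=[1, 2, 3])
    return tf.reduce_mean(ssim_weight * dssim + mse_weight * mse)
```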
Figure 4 outlines their src2dst (source to destination) method.
Equation 1 shows their blending algorithm. It combines the re-aligned generated face and the target image using a target face mask.
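For reference, a generic mask-based blend has the following form (my notation; the paper's Equation 1 may differ in details):

```latex
% M: target face mask, \hat{I}: re-aligned generated face, I_t: target image,
% \odot: element-wise multiplication.
I_{\mathrm{blend}} = M \odot \hat{I} + (1 - M) \odot I_t
```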
Equation 2 gives the Poisson blending optimization objective.
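As a reminder, the standard Poisson image-editing objective that this kind of blending builds on is (again my notation, not necessarily the paper's exact Equation 2):

```latex
% Solve for the blended image I inside the mask region \Omega so that its
% gradients match the generated face \hat{I}, with the boundary pinned to the
% target image I_t.
\min_{I} \iint_{\Omega} \left\lVert \nabla I - \nabla \hat{I} \right\rVert^{2} \, dx \, dy
\quad \text{s.t.} \quad I\big|_{\partial\Omega} = I_t\big|_{\partial\Omega}
```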
To sharpen the image, they use a pre-trained face super-resolution network called FaceEnhancer.
For HD models, they use pixel shuffle (depth-to-space) to perform upsampling.
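A small sketch of pixel-shuffle upsampling using TensorFlow's `depth_to_space` (written with tf.keras layers for brevity, whereas DFL itself uses its own Leras library; the layer sizes are illustrative):

```python
# Hedged sketch of a pixel-shuffle (depth-to-space) upscale block.
import tensorflow as tf

def upscale_block(x, filters, scale=2):
    # Produce scale^2 times as many channels, then rearrange them spatially:
    # (B, H, W, filters * scale^2) -> (B, H * scale, W * scale, filters).
    x = tf.keras.layers.Conv2D(filters * scale * scale, kernel_size=3,
                               padding="same",
                               activation=tf.nn.leaky_relu)(x)
    return tf.nn.depth_to_space(x, block_size=scale)
```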