We currently run two detectors, both based on neural networks that operate on the faces detected in a given video.

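As a rough illustration, the per-face flow looks something like the sketch below. The `score_video` helper, the stand-in models, and the max/mean aggregation are all hypothetical simplifications for this post, not our production code:

```python
import numpy as np

def score_video(face_crops, pattern_model, autofake_model):
    """Aggregate per-face fake probabilities into one video-level score.

    `face_crops` is an iterable of H x W x 3 arrays produced by a face
    detector; both models map a single crop to a probability in [0, 1].
    """
    # Take the more suspicious of the two detectors for each face,
    # then average across all faces in the video (illustrative choice).
    scores = [max(pattern_model(c), autofake_model(c)) for c in face_crops]
    return float(np.mean(scores)) if scores else 0.0

# Usage with trivial stand-in models:
crops = [np.zeros((224, 224, 3), dtype=np.uint8)]
print(score_video(crops, lambda c: 0.1, lambda c: 0.2))
```
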
One model examines statistical patterns at various scales of the face and compares them with the patterns it learned from a large dataset containing many kinds of deepfakes alongside their real counterparts. Based on these patterns, the model discriminates between the two classes.

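A toy version of such a classifier might look like the following PyTorch sketch. The `MultiScalePatternNet` name, the layer sizes, and the training step are illustrative assumptions, not the actual architecture:

```python
import torch
import torch.nn as nn

class MultiScalePatternNet(nn.Module):
    """Binary real/fake classifier pooling features at several scales."""

    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.block3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Features from all three scales feed a single real/fake logit.
        self.head = nn.Linear(16 + 32 + 64, 1)

    def forward(self, x):
        f1 = self.block1(x)          # fine-grained, full-resolution patterns
        f2 = self.block2(f1)         # mid-scale patterns
        f3 = self.block3(f2)         # coarse, face-level patterns
        feats = torch.cat([self.pool(f).flatten(1) for f in (f1, f2, f3)], dim=1)
        return self.head(feats)      # logit; sigmoid gives fake probability

model = MultiScalePatternNet()
loss_fn = nn.BCEWithLogitsLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# One illustrative training step on a dummy batch (label 1 = deepfake).
faces = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8, 1)).float()
loss = loss_fn(model(faces), labels)
opt.zero_grad()
loss.backward()
opt.step()
```
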
The other detector is trained on a very large dataset of real videos from which we create auto-deepfakes: we extract a subject's face, apply a random geometric perturbation such as a rotation or a resize, and re-stitch it over the original face. This technique lets us train a statistical detector very similar to the previous one without needing actual deepfakes. The result is a less data-hungry model that is still capable of discerning between the two classes.

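A rough sketch of this auto-deepfake construction is shown below, assuming a precomputed face bounding box and simple hard-edge re-stitching (real pipelines typically blend the seam more carefully):

```python
import numpy as np
import cv2

def make_auto_deepfake(frame: np.ndarray, box, rng: np.random.Generator):
    """Turn a real frame into a pseudo-fake training sample."""
    x, y, w, h = box                      # face bounding box in pixels
    face = frame[y:y + h, x:x + w].copy()

    # Random geometric perturbation: a small rotation and rescale.
    angle = rng.uniform(-10, 10)
    scale = rng.uniform(0.9, 1.1)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    warped = cv2.warpAffine(face, M, (w, h), borderMode=cv2.BORDER_REFLECT)

    # Re-stitch the perturbed face over the original; the resulting
    # seam artifacts are what the detector learns to spot.
    out = frame.copy()
    out[y:y + h, x:x + w] = warped
    return out

rng = np.random.default_rng(0)
frame = np.zeros((480, 640, 3), dtype=np.uint8)
fake = make_auto_deepfake(frame, (200, 100, 160, 160), rng)
```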