The making of a deepfake is typically a covert affair, with the creator concealing their identity behind their computer screen or a cyber pseudonym. Not so when producers from British TV network Channel 4 released behind-the-scenes footage of how they’d created a deepfake of Queen Elizabeth performing a TikTok dance routine and joking about her favorite pastime of “Netflix and Phil.”
The queen was portrayed by British actress Debra Stephenson in a video that was supposed to offer “a stark warning about the advanced technology that is enabling the proliferation of misinformation and fake news in a digital age.”
Despite their ability to generate transformative digital effects, deepfakes have seen limited uptake in Hollywood. Most deepfakes are experimental and made more for shock value or to raise awareness of the lack of regulation around misinformation circulating online. A report from Deeptrace Labs found that there were over 14,000 deepfake videos circulating online as of September 2019, 96% of which were pornographic.
“As far as deepfakes go, there aren’t specific intellectual property laws governing its use,” said Mark Adams, a cybersecurity expert and a mentor for Springboard’s Cyber Security Career Track. “But there are definitely IP laws around fair use. Is it fair use or is it not? If you’re not making money off of it, is it still illegal? It’s murky at best.”
Using image-to-image translation in filmmaking
A deepfake is an image, video or audio track that has been manipulated using deep learning to mimic another person’s speech or likeness. This type of image-to-image translation can replace faces, manipulate facial expressions and even synthesize faces and speech. The simplest iteration of image-to-image translation is face swapping, as seen in Snapchat filters, where one person’s face is transposed onto that of another person or animal for comic effect. Given a large enough dataset, however, an unsupervised machine learning algorithm can generate a deepfake, which does a far more believable job of synthesizing images to mimic someone’s mannerisms. “Right now, the real targets are people like prominent politicians and the CEOs of major corporations—like a Tom Hanks or a Bill Gates,” said Adams.
Image-to-image translation takes images from one domain and transforms them so they have the style (or characteristics) of images from another domain, such as a photo of the Eiffel Tower rendered à la Van Gogh’s “Starry Night.” This is done using a Generative Adversarial Network (GAN), which consists of two machine learning models working antagonistically: a generator and a discriminator. The generator attempts to generate new examples or derivations of the input data that can pass for real data, while the discriminator classifies these outputs as real or fake. The ultimate goal is for the algorithm to generate a new image so believable it “fools” the discriminator.
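The tug-of-war between generator and discriminator can be made concrete with a minimal sketch of the two loss functions involved. This is an illustration under stated assumptions, not how any particular deepfake tool is implemented: the one-dimensional "samples," the hand-picked discriminator weights `w` and `c`, and all function names are hypothetical, and a real GAN would learn both models from image data with a deep learning library.

```python
import math


def sigmoid(t: float) -> float:
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-t))


def discriminator(x: float, w: float, c: float) -> float:
    """Toy discriminator: estimated probability that sample x is real."""
    return sigmoid(w * x + c)


def discriminator_loss(p_real: float, p_fake: float) -> float:
    """Low when real samples score near 1 and generated samples score near 0."""
    return -math.log(p_real) - math.log(1.0 - p_fake)


def generator_loss(p_fake: float) -> float:
    """Low when the discriminator scores a generated sample as real."""
    return -math.log(p_fake)


# Hypothetical setup: "real" samples sit near 4.0, and the discriminator's
# weights (w=1.0, c=-2.0) are hand-picked for illustration, not learned.
w, c = 1.0, -2.0
real_sample = 4.0
poor_fake = 0.0      # generated sample far from the real data
better_fake = 3.5    # generated sample close to the real data

p_real = discriminator(real_sample, w, c)
p_poor = discriminator(poor_fake, w, c)
p_better = discriminator(better_fake, w, c)

print(f"D(real)={p_real:.3f}  D(poor fake)={p_poor:.3f}  D(better fake)={p_better:.3f}")
print(f"generator loss, poor fake:   {generator_loss(p_poor):.3f}")
print(f"generator loss, better fake: {generator_loss(p_better):.3f}")
print(f"discriminator loss vs. poor fake:   {discriminator_loss(p_real, p_poor):.3f}")
print(f"discriminator loss vs. better fake: {discriminator_loss(p_real, p_better):.3f}")
```

As the fake moves closer to the real data, the generator's loss falls while the discriminator's loss rises. Training alternates small updates to each model so that each keeps trying to lower its own loss at the other's expense.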
While such transformative visual effects might seem like a boon to Hollywood film productions, which pay VFX studios millions of dollars to insert believable dragons, spaceships and aliens into live action footage using labor-intensive 3D layering, deepfakes have barely gained ground in Hollywood. Computer-generated imagery (CGI) is still the preferred visual effect, even though deepfakes can be made by anyone with sufficient computing power and access to open source deepfake software like Deepfake App (available under a “freemium” plan) or DeepFaceLab, which is open sourced on GitHub.
Hollywood films have even begun using “full-body” CGI and archival footage to bring back deceased actors, such as the return of Carrie Fisher in Star Wars: The Rise of Skywalker (2019), in which unused footage of her character, Princess Leia, shot for The Force Awakens was used to close the character’s story. In fact, Finding Jack, a Vietnam-era action drama announced in 2019, was set to star a CGI recreation of actor James Dean, who died in 1955. Actor Will Smith’s younger clone in 2019’s Gemini Man marked a major advance in VFX technology, “the industry’s most believable digital human,” according to IndieWire. VFX artists said that since the actor had aged so well, they couldn’t simply digitally smooth a sagging cheek or drooping jowl, or systematically soften crow’s feet, to create the appearance of youth.
“We had to do a really deep dive into understanding what youth really means,” Guy Williams, a VFX supervisor at Weta Digital, told the publication.
In these instances, film producers need to have a wide variety of footage of the actor in order to convincingly recreate them. Deepfake machine learning models, on the other hand, simply need to be trained on a large enough image dataset with a human actor to serve as a stand-in, making it possible to recreate the image of a deceased actor with fewer restrictions.
Can deepfake tech make an impact in Hollywood?
For now, deepfakes are still too low-resolution for the big screen. Open source deepfake software can only create videos at a maximum resolution of 256 x 256 pixels, whereas the majority of theaters use 2K digital projection, a container format with a resolution of 2048 x 1080 pixels. However, scientists affiliated with Disney Research Studios are working on a model that can produce video at a resolution of 1024 x 1024, suggesting that the technology could be headed for the silver screen.
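To put the resolutions quoted above in perspective, the short sketch below compares raw pixel counts per frame. The figures come from this article; the helper name `pixel_count` and the variable names are made up for illustration, and pixel count is only a rough proxy for perceived quality.

```python
def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in a single frame."""
    return width * height


open_source = pixel_count(256, 256)      # typical open source deepfake output
disney_model = pixel_count(1024, 1024)   # Disney-affiliated research model
cinema_2k = pixel_count(2048, 1080)      # 2K theatrical container

print(f"Open source deepfake: {open_source:,} pixels")
print(f"Disney research model: {disney_model:,} pixels")
print(f"2K cinema frame: {cinema_2k:,} pixels")
print(f"2K vs. open source: {cinema_2k / open_source:.2f}x")
print(f"Disney model vs. open source: {disney_model / open_source:.0f}x")
print(f"Disney model as share of 2K: {disney_model / cinema_2k:.1%}")
```

A 2K frame holds roughly 34 times as many pixels as a 256 x 256 deepfake. The 1024 x 1024 model has 16 times the pixels of the open source output, yet still covers just under half of a 2K frame.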
For these reasons, deepfakes in the entertainment industry have remained largely limited to fan-made videos thus far, which are typically released in response to poorly executed CGI. Martin Scorsese’s The Irishman (2019) saw actor Robert De Niro digitally de-aged to appear thirty years younger, but the face swap failed to conceal the fact that De Niro, who was in his seventies at the time of filming, didn’t move with the dexterity of the young man he was digitally impersonating. Shortly after The Irishman’s release, a fan-made deepfake emerged on a YouTube account run by a deepfake creator known only as Shamook, who ran footage from the film through deepfake software. Shamook’s creation shows a noticeably more fresh-faced De Niro compared to the Netflix CGI, in which the actor’s lined face looks like present-day De Niro but with darker hair.
Deepfake hobbyists have a penchant for rolling out remakes that seem far superior to the original CGI-manipulated footage and were produced at a fraction of the cost. CGI uses motion tracking technology to track how the face moves from various angles when an actor is speaking, crying, shouting and so on to recreate these movements in 3D and layer them over live action footage. A deepfake, on the other hand, is trained on an image dataset consisting of thousands of photographs, and generates an entire fabrication on its own. The more images it is given, the more believable the output.
Even more basic cosmetic changes to film footage generally turn out better using deepfakes than CGI. Take, for example, the viral video of actor Henry Cavill’s mustache edited out of Warner Bros.’ “Justice League” using only a $500 computer and an AI algorithm, which some said did a better job than the studio’s CGI department.
Fans have even released deepfakes that swap out the original actor for another, such as Will Smith replacing Keanu Reeves in The Matrix, Jim Carrey as Jack Nicholson in The Shining or Tom Cruise as Christian Bale in American Psycho.
One particularly dystopian outlook on what could happen if deepfakes become mainstream in Hollywood is that actors could technically be replaced altogether by digital recreations of their likeness.
“If Tom Cruise wants too much money, you could just put a fake Tom Cruise out there knowing nobody is going to know the difference,” said Adams. “That way the [film producer] doesn’t have to pay him $10 million to make the next Mission Impossible.”
Instead of spending their time filming scenes, actors could get rich without lifting a finger. They could earn licensing fees on their personal image by releasing the rights to thousands of images that could be used to train a GAN, saving studios millions of dollars.
Audio deepfakes, a form of audio reshuffling that’s been compared to “Photoshop for audio,” can be used for language dubbing, replacing voice-over actors while preserving the original actor’s voice for a more believable viewing experience. Video dialog replacement involves using one actor’s mouth movements to manipulate someone else’s mouth in existing footage.
However, given that AI is capable of generating entirely new human faces and not just altering real ones, it is technically possible to make a film with an all-digital cast, where human actors essentially function as body doubles but are never seen in the final product. Deepfake casting and acting are relatively new, but as deepfakes have grown increasingly realistic, filmmakers have begun using them in TV ads and broadcast-quality productions.
Deepfakes in the mainstream
Mischief USA, a creative agency, produced a pair of ads for a voting rights campaign featuring deepfaked versions of North Korean leader Kim Jong-un and Russian president Vladimir Putin. In this case, the casting process hinged on what training data was already available to teach the model. For Kim, most of his televised speeches showed him wearing glasses, which obscured his face and caused the algorithm to break down, so finding an actor who resembled him was more important. When it came to casting Putin, there was already plenty of footage available online of him giving speeches from various angles, so the producers had more leeway.
To choose the right actor, the team ran their casting tapes through DeepFaceLab and picked the one that produced the most convincing result. “They were effectively acting as a human shield,” Ryan Laney, a visual effects artist involved in the project, told MIT Technology Review.
While deepfake creation was once limited to programmers with knowledge of Python and unsupervised machine learning, deepfake software went viral in 2019 with the release of a mobile app called Zao, which lets people edit themselves into popular films in under eight seconds using just a single photo. However, to avoid copyright infringement, users could only choose from a set of preselected clips lasting just a few seconds. The app developer had likely trained their algorithms on each of these clips to easily re-map a user’s face onto them.
While deepfake detection tools are not yet considered part of an organization’s core cybersecurity infrastructure, Adams says deepfakes could become more of a threat as the technology grows more sophisticated.
“At the moment, deepfake detection never factors into penetration testing, vulnerability testing or anything like that,” he said. “However, they would be important for a physical security and a PR perspective or marketing.”
Deepfakes are used mostly for entertainment purposes, such as people inserting themselves into movie scenes or casting their favorite actor in an iconic scene, but outside of entertainment, they can also serve as a form of propaganda. In 2018, comedian Jordan Peele released a deepfake of former president Barack Obama, with Peele’s own voice standing in for Obama’s, in which the fake Obama called then-president Trump “a complete and total dips**t.” Not long after, two artists put out a deepfake of Facebook founder Mark Zuckerberg bragging about exerting “total control” over billions of people’s personal data. In both instances, however, the creators made it explicitly known that the videos they released were deepfakes of their own making.
“If you look at the science of propaganda, you’re not trying to convince everyone that what they’re saying is true and accurate—they simply need to convince enough people,” said Adams. “So it becomes the tyranny of the majority.”
Does image-to-image translation intrigue you? Are you interested in machine learning? Check out Springboard’s Machine Learning Engineering Career Track to build your career in this challenging domain. The six-month program offers a project-driven curriculum led by 1:1 mentoring, along with personal career coaching, to help you acquire job-ready skills.