AI voice-cloning tools can spread misinformation on social media

In a video from a January 25 news report, President Joe Biden talks about tanks. But an edited version of the video, which has garnered hundreds of thousands of views on social media this week, appears to show him delivering a speech that attacks transgender people.

Digital forensics experts say the video was created using a new generation of artificial intelligence tools, which allow anyone to quickly generate audio simulating a person’s voice with a few clicks of a button. And while the clip may have failed to fool most users this time, it shows how easy it now is for people to create “deepfake” videos filled with hateful misinformation that can cause real-world harm.

“Tools like this are basically going to add more fuel to the fire,” said Hafiz Malik, a professor of electrical and computer engineering at the University of Michigan who focuses on multimedia forensics. “The monster is already on the loose.”

The latest wave of such tools arrived last month with the beta release of ElevenLabs’ voice synthesis platform, which lets users generate realistic audio of any person’s voice by uploading a few minutes of audio samples and typing in any text for it to say.

The startup says the technology was developed to dub audio in different languages for movies, audiobooks and gaming, while preserving the speaker’s voice and emotion.

Social media users quickly began sharing AI-generated audio clips of Hillary Clinton reading the same transphobic text featured in the Biden clip, of Bill Gates purportedly saying that the COVID-19 vaccine causes AIDS, and of actress Emma Watson supposedly reading Hitler’s manifesto “Mein Kampf.”

Shortly after, ElevenLabs tweeted that it was seeing an “increasing number of voice cloning misuse cases” and announced that it was exploring safeguards to reduce abuse. One of the first steps was to make the feature available only to people who provide payment information; initially, anonymous users could access the voice-cloning tool for free. The company also claims that, if problems arise, it can trace any generated audio back to its creator.

But even the ability to track creators won’t mitigate the harm of the tool, said Hany Farid, a professor at the University of California, Berkeley, who focuses on digital forensics and misinformation.

“The damage is done,” he said.

As an example, Farid said bad actors could roil the stock market with fake audio of a top CEO saying profits are down. And there is already a clip on YouTube that uses the tool to make it appear Biden said the US was launching a nuclear attack against Russia.

Free and open-source software with similar capabilities has also sprung up online, meaning even paywalls on commercial tools are no barrier. Using one free online model, the AP generated audio samples that sounded like actors Daniel Craig and Jennifer Lawrence in just a few minutes.

“The question is where to point the finger and how to put the genie back in the bottle?” Malik said. “We can’t do it.”

When deepfakes first made headlines about five years ago, they were fairly easy to spot because the subject didn’t blink and the audio sounded robotic. That is no longer the case as the tools have become more sophisticated.

For example, the altered video of Biden making derogatory remarks about transgender people combined AI-generated audio with an actual clip of the president, taken from a January 25 CNN live broadcast announcing the US would send tanks to Ukraine. Biden’s mouth was doctored in the video to match the audio. While most Twitter users recognized that the content was not something Biden was likely to say, they were still surprised by how realistic it appeared. Others thought it was real, or at least didn’t know what to believe.

Farid said Hollywood studios have long been able to distort reality, but access to that technology has now been democratized without consideration of the implications.

“It’s a combination of very, very powerful AI-based technology, ease of use, and then the fact that the model seems to be: let’s put it on the internet and see what happens next,” Farid said.

Audio is just one area where AI-generated misinformation poses a threat.

Free online AI image generators such as Midjourney and DALL-E can churn out photorealistic images of war and natural disasters in the style of legacy media outlets from a simple text prompt. Last month, some school districts in the US began blocking ChatGPT, which can produce readable text, such as student term papers, on demand.

ElevenLabs did not respond to a request for comment.
