homehome Home chatchat Notifications


Voice mimicking AI dupes Alexa and other voice recognition devices

These algorithms open up a new avenue of scamming and hacking.

Tibi Puiu
October 14, 2021 @ 5:48 pm

share Share

Credit: Pixabay.

Deepfakes (a portmanteau of “deep learning” and “fake“) are synthetic media in which a real person’s pictures, videos, or speech are converted into someone else’s (often a celebrity’s) artificial, AI-generated likeness. You may have come across some on the internet before, such as Tom Cruise deepfakes on Tik Tok or Joe Rogan voice clones.

While image and video varieties are more convincing, the impression was that audio deepfakes have lagged behind — not without copious amounts of training audio, at least. But a new study serves as a wake-up call, showing that voice copying algorithms that are easy to find on the internet are already pretty good. In fact, the researchers found that with minimal amounts of training, these algorithms can fool voice recognition devices, such as Amazon’s Alexa.

Researchers at the University of Chicago’s Security, Algorithms, Networking and Data (SAND) Lab tested two of the most popular deepfake voice synthesis algorithms — SV2TTS and AutoVC — both of which are open-source and freely available on Github.

The two programs are known as ‘real-time voice cloning toolboxes’. The developers of SV2TTS boast that only five seconds’ worth of training recordings are enough to generate a passable imitation.

The researchers put both systems to the test by feeding them the same 90 five-minute voice recordings of different people talking. They also recorded their own samples from 14 volunteers, who were asked for permission to see whether the computer-generated voices could unlock their voice recognition devices, such as Microsoft Azure, WeChat, and Amazon Alexa.

SV2TTS was able to trick Microsoft Azure about 30 percent of the time but got the best of both WeChat and Amazon Alexa almost two-thirds, or 63 percent, of the time. A hacker could use this to log into WeChat with a synthetic vocal message mimicking the real user or access a person’s Alexa to make payments to third-party apps.

AutoVC performed quite poorly, being able to fool Microsoft Azure only 15 percent of the time. Since it fell short of expectations, the researchers didn’t bother to test it against WeChat and Alexa voice recognition security.

In another experiment, the researchers enlisted 200 volunteers who were asked to listen to pairs of recordings and identify which of two they thought was fake. The volunteers were tricked nearly half the time, which made their judgments no better than a coin toss.

The most convincing deepfake audios were those mimicking women’s voices and those of non-native English speakers. This is something that researchers are currently looking into.

‘We find that both humans and machines can be reliably fooled by synthetic speech and that existing defenses against synthesized speech fall short,’ the researchers wrote in a report posted on the open-access server arXiv

‘Such tools in the wrong hands will enable a range of powerful attacks against both humans and software systems [aka machines].’

In 2019, a scammer performed an ‘AI heist’, using deep fake voice algorithms to impersonate a German executive at an energy company and convince employees to wire him $240,000. According to the Washington Post, the person who performed the wire transfer found it odd that their boss would make such a request, but the German accent and familiar voice heard over the phone was convincing. Cybersecurity firm Symantec says it has identified similar cases of deepfake voice scams that resulted in losses in the millions of dollars. 

share Share

A Soviet shuttle from the Space Race is about to fall uncontrollably from the sky

A ghost from time past is about to return to Earth. But it won't be smooth.

The world’s largest wildlife crossing is under construction in LA, and it’s no less than a miracle

But we need more of these massive wildlife crossings.

Your gold could come from some of the most violent stars in the universe

That gold in your phone could have originated from a magnetar.

Ronan the Sea Lion Can Keep a Beat Better Than You Can — and She Might Just Change What We Know About Music and the Brain

A rescued sea lion is shaking up what scientists thought they knew about rhythm and the brain

Did the Ancient Egyptians Paint the Milky Way on Their Coffins?

Tomb art suggests the sky goddess Nut from ancient Egypt might reveal the oldest depiction of our galaxy.

Dinosaurs Were Doing Just Fine Before the Asteroid Hit

New research overturns the idea that dinosaurs were already dying out before the asteroid hit.

Denmark could become the first country to ban deepfakes

Denmark hopes to pass a law prohibiting publishing deepfakes without the subject's consent.

Archaeologists find 2,000-year-old Roman military sandals in Germany with nails for traction

To march legionaries across the vast Roman Empire, solid footwear was required.

Mexico Will Give U.S. More Water to Avert More Tariffs

Droughts due to climate change are making Mexico increasingly water indebted to the USA.

Chinese Student Got Rescued from Mount Fuji—Then Went Back for His Phone and Needed Saving Again

A student was saved two times in four days after ignoring warnings to stay off Mount Fuji.