homehome Home chatchat Notifications


AI turns speech into portraits -- and it's eerily accurate sometimes

It's the racist uncle of computers.

Tibi Puiu
April 5, 2022 @ 8:16 pm

share Share

Some law enforcement and intelligence agencies across the world are experimenting with artificial intelligence to identify people using voice recordings and big data. But what if there’s no match? AI systems can still provide valuable input. For instance, researchers at MIT have demonstrated a novel machine-learning algorithm that can synthesize a person’s portrait based solely on a short audio recording of their voice.

Researchers at MIT’S Computer Science and Artificial Intelligence Laboratory (CSAIL) first trained their deep neural network by feeding it millions of videos scraped from YouTube showing people speaking in front of the camera. This huge swath of data trained the AI to correlate certain sound characteristics with facial features corresponding to age, gender, or ethnicity.

Throughout this process, there was no human intervention apart from the initial task of correlating voice and facial features. The AI learned all of this by itself with no external supervision, not even the labeling of subsets of data.

To test the system, the researchers designed a face decoder that reconstructs a speaker’s face from a still frame, regardless of its lighting or pose. This digital reconstruction was then compared to synthesized portraits solely from a speaker’s voice, with striking results that you can see in these images.

The synthesized faces are generic, meaning the produced images are not of specific individuals as those produced by the decoder. Nevertheless, they still manage to capture the basic facial features of a speaker such as skin color, gender, and age. The longer the voice recordings were, the more accurate the synthesized portrait proved to be.

But there were also plenty of mismatches. High-pitched voices were often identified as female, even in cases when they came from males, such as young boys. Asian men who spoke in American English had portraits resembling white males, but this did not happen when the Asian voice spoke in Chinese.

The AI may remind some of their racist uncle, and the researchers are aware of these biases and are looking to overcome these limitations. Improving the system’s accuracy is a matter of providing more training data that is representative of the general population.

Until these limitations are addressed, real-world applications of this AI system should be treated with care. One possible use could be to make interactions between humans and machines more appealing. Machine-generated voices used by home devices and virtual assistants could now be given an appropriate face. Law enforcement could use this neural network to generate a portrait of a suspect when the only evidence is a voice recording. However, any government use will be sure to be met with criticism surrounding privacy and ethics.

“We believe that generating faces, as opposed to predicting specific attributes, may provide a more comprehensive view of voice face correlations and can open up new research opportunities and applications,” the MIT authors wrote in the description of their project called Speech2Face.

[via PetaPixel]

share Share

The Universe’s First “Little Red Dots” May Be a New Kind of Star With a Black Hole Inside

Mysterious red dots may be a peculiar cosmic hybrid between a star and a black hole.

Peacock Feathers Can Turn Into Biological Lasers and Scientists Are Amazed

Peacock tail feathers infused with dye emit laser light under pulsed illumination.

Helsinki went a full year without a traffic death. How did they do it?

Nordic capitals keep showing how we can eliminate traffic fatalities.

Scientists Find Hidden Clues in The Alexander Mosaic. Its 2 Million Tiny Stones Came From All Over the Ancient World

One of the most famous artworks of the ancient world reads almost like a map of the Roman Empire's power.

Ancient bling: Romans May Have Worn a 450-Million-Year-Old Sea Fossil as a Pendant

Before fossils were science, they were symbols of magic, mystery, and power.

This AI Therapy App Told a Suicidal User How to Die While Trying to Mimic Empathy

You really shouldn't use a chatbot for therapy.

This New Coating Repels Oil Like Teflon Without the Nasty PFAs

An ultra-thin coating mimics Teflon’s performance—minus most of its toxicity.

Why You Should Stop Using Scented Candles—For Good

They're seriously not good for you.

People in Thailand were chewing psychoactive nuts 4,000 years ago. It's in their teeth

The teeth Chico, they never lie.

To Fight Invasive Pythons in the Everglades Scientists Turned to Robot Rabbits

Scientists are unleashing robo-rabbits to trick and trap giant invasive snakes