AI turns speech into portraits -- and it's eerily accurate sometimes

It's the racist uncle of computers.

Tibi Puiu
April 5, 2022 @ 8:16 pm

Some law enforcement and intelligence agencies across the world are experimenting with artificial intelligence to identify people using voice recordings and big data. But what if there’s no match? AI systems can still provide valuable input. For instance, researchers at MIT have demonstrated a novel machine-learning algorithm that can synthesize a person’s portrait based solely on a short audio recording of their voice.

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) first trained their deep neural network by feeding it millions of videos scraped from YouTube showing people speaking in front of the camera. This huge swath of data trained the AI to correlate certain sound characteristics with facial features corresponding to age, gender, or ethnicity.

Throughout this process, there was no human intervention beyond setting the initial task of correlating voice and facial features. The AI learned these correlations by itself with no external supervision, not even labels on subsets of the data.
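
To give a feel for what learning without labels means here, the toy sketch below fits a mapping from voice features to face features using only the fact that each pair comes from the same clip. It is an illustration under loud assumptions, not the MIT team's network: the data is synthetic, the model is a plain linear fit rather than a deep neural network, and every name in it (`make_paired_clips`, `fit_voice_to_face`, the feature sizes) is hypothetical.

```python
import numpy as np


def make_paired_clips(n_clips, voice_dim=8, face_dim=4, noise=0.01, seed=0):
    """Synthetic stand-in for video clips: each row pairs the voice features
    and the face features of the same (made-up) speaker."""
    rng = np.random.default_rng(seed)
    # Hidden relation between voice and face, invented for this toy example.
    true_map = rng.normal(size=(voice_dim, face_dim))
    voice = rng.normal(size=(n_clips, voice_dim))
    face = voice @ true_map + noise * rng.normal(size=(n_clips, face_dim))
    return voice, face


def fit_voice_to_face(voice, face):
    """Least-squares fit. The only training signal is the pairing itself:
    nobody annotated age, gender, or ethnicity."""
    weights, *_ = np.linalg.lstsq(voice, face, rcond=None)
    return weights


def mean_squared_error(pred, target):
    return float(np.mean((pred - target) ** 2))


voice, face = make_paired_clips(500)
train_voice, test_voice = voice[:400], voice[400:]
train_face, test_face = face[:400], face[400:]

weights = fit_voice_to_face(train_voice, train_face)

# Compare against a baseline that ignores the voice and always predicts zeros.
test_error = mean_squared_error(test_voice @ weights, test_face)
baseline_error = mean_squared_error(np.zeros_like(test_face), test_face)
print(f"error using voice: {test_error:.5f}, ignoring voice: {baseline_error:.5f}")
```

On held-out clips, the fitted model should land much closer to the true face features than the baseline that ignores the voice. That gap is the point of the sketch: pairing alone carries enough signal to learn from.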

To test the system, the researchers designed a face decoder that reconstructs a speaker’s face from a still frame, regardless of its lighting or pose. This digital reconstruction was then compared to portraits synthesized solely from the speaker’s voice, with striking results that you can see in these images.

The synthesized faces are generic: unlike those produced by the decoder, they do not depict specific individuals. Nevertheless, they still manage to capture a speaker’s basic facial features, such as skin color, gender, and age. The longer the voice recording, the more accurate the synthesized portrait proved to be.

But there were also plenty of mismatches. High-pitched voices were often identified as female, even when they came from males, such as young boys. Asian men who spoke American English had portraits resembling white males, but this did not happen when Asian speakers spoke Chinese.

The AI may remind some of their racist uncle, and the researchers are aware of these biases and are looking to overcome these limitations. Improving the system’s accuracy is a matter of providing more training data that is representative of the general population.

Until these limitations are addressed, real-world applications of this AI system should be treated with care. One possible use could be to make interactions between humans and machines more appealing. Machine-generated voices used by home devices and virtual assistants could be given an appropriate face. Law enforcement could use this neural network to generate a portrait of a suspect when the only evidence is a voice recording. However, any government use is sure to be met with criticism over privacy and ethics.

“We believe that generating faces, as opposed to predicting specific attributes, may provide a more comprehensive view of voice-face correlations and can open up new research opportunities and applications,” the MIT authors wrote in the description of their project, called Speech2Face.

[via PetaPixel]
