ZME Science


New AI program creates realistic ‘talking heads’ from only an image and an audio

Anyone can now speak like Obama -- digitally.

by Mihai Andrei
November 24, 2023
in Future, News, Technology
Edited and reviewed by Tibi Puiu
Image generated by AI (not in the study).

The landscape of generative AI is ever-evolving, and in the past year it has truly taken off. Seemingly overnight, we have AIs that can generate images or text with stunning ease. This new achievement builds on that momentum and takes it one step further. A team of researchers led by Associate Professor Lu Shijian of Nanyang Technological University (NTU) in Singapore has developed a computer program that creates realistic videos reflecting the facial expressions and head movements of the person speaking.

This concept, known as audio-driven talking face generation, has gained significant traction in both academic and industrial realms due to its vast potential applications in digital human visual dubbing, virtual reality, and beyond. The core challenge lies in creating facial animations that are not just technically accurate but also convey the subtle nuances of human expressions and head movements in sync with the spoken audio.

The problem is that humans have a vast range of facial movements and emotions, and capturing the entire spectrum is extremely difficult. But the new method seems to capture everything, including accurate lip movements, vivid facial expressions, and natural head poses, all derived from the same audio input.

Diverse yet realistic facial animations

A DIRFA-generated ‘talking head’ created from just an audio clip of former US president Barack Obama speaking and a photo of Associate Professor Lu Shijian. Credit: Nanyang Technological University

The research paper in focus introduces DIRFA (Diverse yet Realistic Facial Animations). The team trained DIRFA on more than one million clips from 6,000 people, drawn from an open-source database. The engine doesn’t focus only on lip-syncing; it attempts to derive the entire range of facial movements and reactions.

First author Dr. Wu Rongliang, a Ph.D. graduate from NTU’s School of Computer Science and Engineering (SCSE), said:

“Speech exhibits a multitude of variations. Individuals pronounce the same words differently in diverse contexts, encompassing variations in duration, amplitude, tone, and more. Furthermore, beyond its linguistic content, speech conveys rich information about the speaker’s emotional state and identity factors such as gender, age, ethnicity, and even personality traits.”

Once trained, DIRFA takes in a static portrait of a person plus an audio clip and produces a 3D video showing the person speaking. The result isn’t perfectly smooth, but the facial animations are consistent.
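The general shape of this kind of pipeline can be sketched in a few lines. Note that DIRFA's actual architecture and interface are not described in this article, so everything below — the `FaceFrame` parameters, the `animate` function, and the simple energy-to-mouth mapping — is a hypothetical stand-in illustrating how audio-driven talking-face systems map each audio frame to facial-animation parameters:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FaceFrame:
    mouth_open: float   # lip articulation, 0..1
    brow_raise: float   # expression cue, 0..1
    head_yaw: float     # head pose offset, in degrees

def animate(portrait_path: str, audio_energy: List[float]) -> List[FaceFrame]:
    """Toy stand-in model: one animation frame per audio frame.

    A real system would extract learned audio features and drive a
    neural renderer over the portrait; here we just map loudness to
    plausible facial parameters to show the input/output contract.
    """
    frames = []
    for i, energy in enumerate(audio_energy):
        frames.append(FaceFrame(
            mouth_open=min(1.0, energy),                 # louder speech, wider mouth
            brow_raise=0.2 + 0.3 * (energy > 0.7),       # emphasis raises the brows
            head_yaw=2.0 * ((i % 20) / 20 - 0.5),        # slow, natural head sway
        ))
    return frames

# One portrait plus per-frame audio loudness in, per-frame face parameters out.
frames = animate("portrait.jpg", [0.1, 0.8, 0.4])
```

The key point the sketch illustrates is that the video is driven entirely by the audio track: the portrait only supplies identity, while every frame's expression and pose is derived from the sound.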

“Our program also builds on previous studies and represents an advancement in the technology, as videos created with our program are complete with accurate lip movements, vivid facial expressions and natural head poses, using only their audio recordings and static images,” says corresponding author Associate Professor Lu Shijian.

Why this matters

Far from being only a cool party trick (and potentially being used for disinformation by malicious actors), this technology has important and positive applications.

In healthcare, it promises to enhance the capabilities of virtual assistants and chatbots, making digital interactions more engaging and empathetic. More profoundly, it could serve as a transformative tool for individuals with speech or facial disabilities, offering them a new avenue to communicate their thoughts and emotions through expressive digital avatars.

While DIRFA opens up exciting possibilities, it also raises important ethical questions, particularly in the context of misinformation and digital authenticity. Addressing these concerns, the NTU team suggests incorporating safeguards like watermarks to indicate the synthetic nature of the videos. But if there’s anything the internet has taught us, it’s that there are ways around such safeguards.

It’s still early days for all AI technology. The potential for important societal impact is there, but so is the risk of misuse. As always, we should ensure that the digital world we are creating is safe, authentic, and beneficial for all.

The study was published in the journal Pattern Recognition.

Mihai Andrei

Dr. Andrei Mihai is a geophysicist and founder of ZME Science. He has a Ph.D. in geophysics and archaeology and has completed courses from prestigious universities (with programs ranging from climate and astronomy to chemistry and geology). He is passionate about making research more accessible to everyone and communicating news and features to a broad audience.

© 2007-2025 ZME Science - Not exactly rocket science. All Rights Reserved.
