

MIT makes an AI that can predict what the future looks and sounds like

Artificial intelligence is learning in seconds what took humans a lifetime to master.

Tibi Puiu
June 21, 2016 @ 6:48 pm


As humans, we’re pretty good at anticipating things, but that’s much harder for robots. If you walk down the street and see two people meeting in front of a café, you know they’ll shake hands or even hug a few seconds later, depending on how close their relationship is. We humans sense this sort of thing very easily, and now some robots can too.

Left: still frame given to the algorithm, which had to predict what happened next. Right: next frames from the video. Credit: MIT CSAIL

MIT used deep-learning algorithms (neural networks that teach computers to find patterns by themselves in an ocean of data) to build an artificial intelligence that can predict what action will occur next, starting from nothing but a still frame. Each of these networks was trained to classify an action as either a hug, handshake, high-five, or kiss; the networks’ outputs are then merged to predict what happens next. The demonstration video below is telling.
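To make the idea concrete, here is a minimal sketch in PyTorch of how several frame-level action classifiers might be merged into a single next-action predictor. It is purely illustrative, not MIT’s actual code: the layer sizes, the number of networks, and the score-averaging merge rule are all assumptions.

```python
# Purely illustrative sketch (not MIT CSAIL's code): several action
# classifiers score a still frame, and their scores are merged into one
# prediction of what happens next.
import torch
import torch.nn as nn

ACTIONS = ["hug", "handshake", "high-five", "kiss"]

class ActionClassifier(nn.Module):
    """Scores a single still frame against the four action classes."""
    def __init__(self, feature_dim=512):
        super().__init__()
        # Stand-in for a convolutional backbone; sizes are assumptions.
        self.backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(feature_dim), nn.ReLU())
        self.head = nn.Linear(feature_dim, len(ACTIONS))

    def forward(self, frame):
        return self.head(self.backbone(frame))     # unnormalized class scores

class MergedPredictor(nn.Module):
    """Averages several classifiers' scores -- one plausible merge rule."""
    def __init__(self, n_networks=3):
        super().__init__()
        self.networks = nn.ModuleList(ActionClassifier() for _ in range(n_networks))

    def forward(self, frame):
        logits = torch.stack([net(frame) for net in self.networks])
        return logits.mean(dim=0).softmax(dim=-1)  # probability for each action

frame = torch.rand(1, 3, 224, 224)                 # one RGB still frame
probs = MergedPredictor()(frame)
print(dict(zip(ACTIONS, probs.squeeze().tolist())))
```

In the real system the backbone would be a deep convolutional network trained on video, and the merging rule may differ; the averaging step here is only a placeholder.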

To train the algorithm, the researchers fed it more than 600 hours of video footage from YouTube, as well as the complete series of “The Office” and “Desperate Housewives.” In testing, the system correctly predicted which action would happen in the next second 43 percent of the time.

In a second test, the algorithm was shown a frame from a video and asked to predict what object would appear five seconds later. For instance, a frame might show a person attempting to open a microwave. Inside might be some leftover pasta from last night, a coffee mug and so on.

The algorithm correctly predicted which object appeared 11 percent of the time. That might not sound impressive, but it’s still 30 percent better than the baseline measurement, and some of its guesses beat those made by people. Humans given the same test were right only 71 percent of the time.
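For reference, a figure like that 11 percent is just top-1 accuracy over the test clips. The tally below is a generic illustration over made-up predictions and labels, not the researchers’ evaluation code.

```python
# Illustration only: tallying top-1 accuracy for the object-prediction test.
# The prediction and ground-truth lists below are made up for the example.
def top1_accuracy(predicted_objects, true_objects):
    correct = sum(p == t for p, t in zip(predicted_objects, true_objects))
    return correct / len(true_objects)

predictions  = ["coffee mug", "pasta", "plate", "coffee mug"]
ground_truth = ["pasta", "pasta", "bowl", "coffee mug"]
print(f"accuracy: {top1_accuracy(predictions, ground_truth):.0%}")  # prints "accuracy: 50%"
```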

“There’s a lot of subtlety to understanding and forecasting human interactions,” says Carl Vondrick, a PhD student at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). “We hope to be able to work off of this example to be able to soon predict even more complex tasks.”

“I’m excited to see how much better the algorithms get if we can feed them a lifetime’s worth of videos,” Vondrick added. “We might see some significant improvements that would get us closer to using predictive-vision in real-world situations.”

For now, there’s no practical application in mind for this deep-learning algorithm, but the researchers say the present work might one day lead to household robots that cooperate with humans more smoothly by shedding some of their inherent awkwardness.

Credit: MIT CSAIL

From the same Computer Science and Artificial Intelligence Laboratory (CSAIL) came another predictive artificial intelligence, this time for sound. When shown a video of an object being hit, it produces the sound the impact would make.
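As with the action predictor, a rough sketch may help picture the setup. The model below (a per-frame encoder feeding a recurrent layer that outputs sound features) is an assumption-laden illustration in PyTorch, not the CSAIL system described in the paper.

```python
# Purely illustrative sketch (not the CSAIL sound system): a model that maps
# a short clip of video frames to a sequence of predicted sound features.
import torch
import torch.nn as nn

class SoundFromVideo(nn.Module):
    def __init__(self, frame_feat=256, audio_dim=128):
        super().__init__()
        # Per-frame encoder; a real system would use a convolutional network.
        self.frame_encoder = nn.Sequential(nn.Flatten(2), nn.LazyLinear(frame_feat), nn.ReLU())
        # Recurrent layer to track motion across frames.
        self.temporal = nn.LSTM(frame_feat, frame_feat, batch_first=True)
        self.to_audio = nn.Linear(frame_feat, audio_dim)  # sound features per frame

    def forward(self, clip):                  # clip: (batch, time, channels, H, W)
        feats = self.frame_encoder(clip)      # -> (batch, time, frame_feat)
        feats, _ = self.temporal(feats)
        return self.to_audio(feats)           # -> (batch, time, audio_dim)

clip = torch.rand(1, 16, 3, 112, 112)         # 16 frames of a 112x112 video
sound_features = SoundFromVideo()(clip)
print(sound_features.shape)                   # torch.Size([1, 16, 128])
```

The sketch stops at predicting per-frame sound features; turning those features back into audible sound is a separate step that the paper handles in its own way.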

The generated sounds are realistic enough to fool a human viewer. When asked whether a sound was real or fake, human participants picked the fake sound over the real one twice as often as they did for a baseline algorithm.

“When you run your finger across a wine glass, the sound it makes reflects how much liquid is in it,” says CSAIL PhD student Andrew Owens, who was lead author of the paper describing MIT’s CSAIL sound algorithm. “An algorithm that simulates such sounds can reveal key information about objects’ shapes and material types, as well as the force and motion of their interactions with the world.”

In other words, both algorithms are now learning in seconds what most humans take a lifetime to learn: how people react, social norms, even what sound a drumstick makes when it hits a puddle of water.

