Missing the forest for the trees
For the second experiment, the team showed images of glass figurines to VGG-19 and a second deep learning network called AlexNet. Both networks were trained to recognize objects using a database called ImageNet. While VGG-19 performed better than AlexNet, both were still pretty terrible. Neither network picked the correct label as its first choice for any figurine: an elephant figurine, for example, was assigned an almost 0% chance of being an elephant by both networks. On average, AlexNet ranked the correct answer 328th out of 1,000 choices.
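That "ranked 328th out of 1,000" figure comes from sorting the network's 1,000 ImageNet class scores and finding where the true class lands. A minimal sketch of that ranking metric — the five-class score vector here is a toy stand-in for a real 1,000-way network output, not data from the study:

```python
def class_rank(scores, correct_idx):
    """Rank of the correct class among all classes,
    where rank 1 means it was the network's first choice."""
    correct_score = scores[correct_idx]
    # One plus the number of classes the network scored higher.
    return 1 + sum(score > correct_score for score in scores)

# Toy 5-class score vector; the true class (index 3) gets a
# near-zero score, as with the elephant figurine.
scores = [0.05, 0.60, 0.30, 0.004, 0.046]
print(class_rank(scores, 3))  # → 5: the correct class ranked last
```

Averaging this rank over all test images gives the 328-out-of-1,000 summary reported for AlexNet.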
In this experiment, too, the networks’ first choices were pretty puzzling: VGG-19, for example, chose “website” for a goose figure and “can opener” for a polar bear.
“The machines make very different errors from humans,” said co-author Hongjing Lu, a UCLA professor of psychology. “Their learning mechanisms are much less sophisticated than the human mind.”
“We can fool these artificial systems pretty easily.”
For the third and fourth experiments, the team focused on contours. First, they showed the networks 40 drawings outlined in black, with the images in white. Again, the machines did a pretty poor job of identifying common items (such as bananas or butterflies). In the fourth experiment, the researchers showed both networks 40 images, this time in solid black. Here, the networks did somewhat better — they listed the correct object among their top five choices around 50% of the time. They identified some items with good confidence (a 99.99% chance for an abacus and a 61% chance for a cannon from VGG-19, for example) while they simply dropped the ball on others (both networks gave a white hammer outlined in black less than a 1% chance of being a hammer).
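The "correct object among the top five choices" measure is the standard ImageNet top-5 criterion: take the five highest-scoring classes and check whether the true label is among them. A small sketch of that check, again with a toy score vector standing in for real network output:

```python
def in_top_k(scores, correct_idx, k=5):
    """True if the correct class is among the k highest-scoring classes."""
    # Class indices sorted from highest score to lowest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return correct_idx in ranked[:k]

# Toy 8-class scores; the true class (index 6) is the second-best guess,
# so it misses top-1 but counts as a top-5 hit.
scores = [0.01, 0.40, 0.02, 0.05, 0.03, 0.20, 0.25, 0.04]
print(in_top_k(scores, 6, k=1))  # → False
print(in_top_k(scores, 6, k=5))  # → True
```

The ~50% figure for the silhouette experiment is the fraction of the 40 images for which this check returns True.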
Still, it’s undeniable that both algorithms performed better on this step than on any previous one. Kellman says this is likely because the images here lacked “internal contours” — edges that confuse the programs.
Throwing a wrench in
Now, in experiment five, the team tried to throw the machine off its game as much as possible. They worked with six images that VGG-19 identified correctly in the previous steps, scrambling them to make them harder to recognize while preserving some pieces of the objects shown. They also employed a group of ten UCLA undergrads as a control group.
The students were shown objects in black silhouettes — some scrambled to be difficult to recognize and some unscrambled, some objects for just one second, and some for as long as the students wanted to view them. Students correctly identified 92% of the unscrambled objects and 23% of the scrambled ones when allowed a single second to view them. When the students could see the silhouettes for as long as they wanted, they correctly identified 97% of the unscrambled objects and 37% of the scrambled objects.
VGG-19 correctly identified five of these six images (and was quite close on the sixth, too, the team writes). The team says humans probably had more trouble identifying the images than the machine because we observe the entire object when trying to determine what we’re seeing. Artificial intelligence, in contrast, works by identifying fragments.
“This study shows these systems get the right answer in the images they were trained on without considering shape,” Kellman said. “For humans, overall shape is primary for object recognition, and identifying images by overall shape doesn’t seem to be in these deep learning systems at all.”
The results suggest that right now, AI (as we know and program it) is simply too immature to face the real world. It’s easily duped, and it works differently than we do — so it’s hard to intuit how it will behave. Still, understanding how such networks ‘see’ the world around them would be very helpful as we move forward with them, the team explains. If we know their weaknesses, we know where we need to put in the most work to make meaningful strides.
The paper “Deep convolutional networks do not classify based on global object shape” has been published in the journal PLOS Computational Biology.