What caricatures can teach us about facial recognition
A Our brains are incredibly agile machines, and it is hard to think of anything they do more efficiently than recognize faces. Just hours after birth, the eyes of newborns are drawn to facelike patterns. An adult brain knows it is seeing a face within 100 milliseconds, and it takes just over a second to realize that two different pictures of a face, even if they are lit or rotated in very different ways, belong to the same person.
B Perhaps the most vivid illustration of our gift for recognition is the magic of caricature: the fact that the sparest cartoon of a familiar face, even a single line dashed off in two seconds, can be identified by our brains in an instant. It is often said that a good caricature looks more like a person than the person themselves. As it happens, this notion, counterintuitive though it may sound, is actually supported by research. In the field of vision science, there is even a term for this seeming paradox, the caricature effect, a phrase that hints at how our brains misperceive faces as much as perceive them.
C Human faces are all built pretty much the same: two eyes above a nose that's above a mouth, the features varying from person to person generally by mere millimetres. So what our brains look for, according to vision scientists, are the outlying features: those characteristics that deviate most from the ideal face we carry around in our heads, the running average of every "visage" we have ever seen. We code each new face we encounter not in absolute terms but in the several ways it differs markedly from the mean. In other words, we accentuate what is most important for recognition and largely ignore what is not. Our perception fixates on the upturned nose, the sunken eyes or the fleshy cheeks, making them loom larger. To better identify and remember people, we turn them into caricatures.
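The idea of coding a face by how it departs from the average can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model taken from the research described here: the function name `caricature`, the `gain` parameter, and the three measurements (in millimetres) are all hypothetical.

```python
def caricature(face, mean_face, gain=1.5):
    """Push each measurement away from the average face.

    gain = 1.0 returns the face unchanged; gain > 1.0 exaggerates
    every deviation from the mean by that factor.
    """
    return [m + gain * (f - m) for f, m in zip(face, mean_face)]


# Hypothetical measurements: nose length, eye spacing, mouth width (mm).
mean_face = [50.0, 62.0, 48.0]
face = [54.0, 62.0, 46.0]

exaggerated = caricature(face, mean_face, gain=2.0)
print(exaggerated)
```

With these made-up numbers, the long nose gets longer and the narrow mouth gets narrower, while the perfectly average eye spacing stays where it is: only the features that already stand out are amplified.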
D Ten years ago, we all imagined that as soon as surveillance cameras had been equipped with the appropriate software, the face of a crime suspect would stand out in a crowd. Like a thumbprint, its unique features and configuration would offer a biometric key that could be immediately checked against any database of suspects. But now a decade has passed, and face-recognition systems still perform miserably in real-world conditions. Just recently, a couple who accidentally swapped passports at an airport in England sailed through electronic gates that were supposed to match their faces to file photos.
E All this leads to an interesting question. What if, to secure our airports and national landmarks, we need to learn more about caricature? After all, it's the skill of the caricaturist, the uncanny ability to quickly distill faces down to their most salient features, that our computers most desperately need to acquire. Clearly, better cameras and faster computers simply aren't going to be enough.
F At the University of Central Lancashire in England, Charlie Frowd, a senior lecturer in psychology, has used insights from caricature to develop a better police-composite generator. His system, called EvoFIT, produces animated caricatures, with each successive frame showing facial features that are more exaggerated than the last. Frowd's research supports the idea that we all store memories as caricatures, but with our own personal degree of amplification. So, as an animated composite depicts faces at varying stages of caricature, viewers respond to the stage that is most recognizable to them. In tests, Frowd's technique has increased positive identifications from as low as 3 percent to upwards of 30 percent.
G To achieve similar results in computer face recognition, scientists would need to model the artist's genius even more closely, a feat that might seem impossible if you listen to some of the artists describe their nearly mystical acquisition of skills. Jason Seiler recounts how he trained his mind for years, beginning in middle school, until he gained what he regards as nothing less than a second sight. 'A lot of people think that caricature is about picking out someone's worst feature and exaggerating it as far as you can,' Seiler says. 'That's wrong. Caricature is basically finding the truth. And then you push the truth.' Capturing a likeness, it seems, has less to do with the depiction of individual features than with their placement in relationship to one another. 'It's how the human brain recognizes a face. When the ratios between the features are correct, you see that face instantly.'
H Pawan Sinha, director of MIT's Sinha Laboratory for Vision Research and one of the nation's most innovative computer-vision researchers, contends that these simple, exaggerated drawings can be objectively and systematically studied and that such work will lead to breakthroughs in our understanding of both human and machine-based vision. His lab at MIT is preparing to computationally analyze hundreds of caricatures this year, from dozens of different artists, with the hope of tapping their intuitive knowledge of what is and isn't crucial for recognition. He has named this endeavor the Hirschfeld Project, after the famous New York Times caricaturist Al Hirschfeld.
I Quite simply, by analyzing sketches, Sinha hopes to pinpoint the recurring exaggerations in the caricatures that most strongly correlate to particular ways that the original faces deviate from the norm. The results, he believes, will ultimately produce a rank-ordered list of the 20 or so facial attributes that are most important for recognition. 'It's a recipe for how to encode the face,' he says. In preliminary tests, the lab has already isolated important areas: for example, the ratio of the height of the forehead to the distance between the top of the nose and the mouth.
J On a given face, four of 20 such Hirschfeld attributes, as Sinha plans to call them, will be several standard deviations greater than the mean; on another face, a different handful of attributes might exceed the norm. But in all cases, it's the exaggerated areas of the face that hold the key. As matters stand today, an automated system must compare its target faces against the millions of continually altering faces it encounters. But so far, the software doesn't know what to look for amid this onslaught of variables. Armed with the Hirschfeld attributes, Sinha hopes that computers can be trained to focus on the features most salient for recognition, tuning out the others. 'Then,' Sinha says, 'the sky is the limit.'
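To make "several standard deviations from the mean" concrete, here is a minimal Python sketch that scores each attribute of one face against a small reference population and keeps only the outliers, ranked by how far they stray. It is not Sinha's method: the attribute names, the sample values, the helper `salient_attributes`, and the cut-off of two standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def salient_attributes(face, population, threshold=2.0):
    """Rank the attributes of `face` that lie at least `threshold`
    standard deviations from the population mean (in either direction)."""
    scores = {}
    for name, value in face.items():
        values = [person[name] for person in population]
        # z-score: distance from the mean in units of standard deviation
        scores[name] = (value - mean(values)) / stdev(values)
    ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return [(name, z) for name, z in ranked if abs(z) >= threshold]


# Hypothetical reference faces and attribute names.
population = [
    {"forehead_ratio": 1.0, "eye_spacing": 60.0, "nose_width": 30.0},
    {"forehead_ratio": 1.1, "eye_spacing": 62.0, "nose_width": 32.0},
    {"forehead_ratio": 0.9, "eye_spacing": 64.0, "nose_width": 34.0},
    {"forehead_ratio": 1.0, "eye_spacing": 62.0, "nose_width": 32.0},
    {"forehead_ratio": 1.0, "eye_spacing": 62.0, "nose_width": 32.0},
]
target = {"forehead_ratio": 1.3, "eye_spacing": 62.5, "nose_width": 28.0}

salient = salient_attributes(target, population)
for name, z in salient:
    print(f"{name}: {z:+.2f} standard deviations")
```

On this invented data, the high forehead and the narrow nose pass the cut-off while the near-average eye spacing is tuned out, which is the kind of filtering the passage describes.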