Who's In the Picture

Tamara L. Berg, Alexander C. Berg, Jaety Edwards, David A. Forsyth
2004 Neural Information Processing Systems  
The context in which a name appears in a caption provides powerful cues as to who is depicted in the associated image. We obtain 44,773 face images, using a face detector, from approximately half a million captioned news images and automatically link names, obtained using a named entity recognizer, with these faces. A simple clustering method can produce fair results. We improve these results significantly by combining the clustering process with a model of the probability that an individual is depicted given its context. Once the labeling procedure is over, we have an accurately labeled set of faces, an appearance model for each individual depicted, and a natural language model that can produce accurate results on captions in isolation.
dblp:conf/nips/BergBEF04 fatcat:tlc3r3bqofesnntsgjqu5mempq