A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
The file type is application/pdf.
BabyTalk: Understanding and Generating Simple Image Descriptions
2013
IEEE Transactions on Pattern Analysis and Machine Intelligence
We present a system to automatically generate natural language descriptions from images. This system consists of two parts. The first part, content planning, smooths the output of computer vision-based detection and recognition algorithms with statistics mined from large pools of visually descriptive text to determine the best content words to use to describe an image. The second step, surface realization, chooses words to construct natural language sentences based on the predicted content and
doi:10.1109/tpami.2012.162
pmid:22848128
fatcat:qhye4obzpbcllos2dr6rohli2u