A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019.
We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds. The language collected is highly detailed, while remaining understandable to the everyday observer (e.g., "heart-shaped face," "squat body"). Paragraph-length descriptions naturally adapt to varying levels of taxonomic and visual distance (drawn from a novel stratified sampling approach) with the appropriate level of detail. We propose a new model called Neural Naturalist
doi:10.18653/v1/d19-1065 dblp:conf/emnlp/ForbesKSB19 fatcat:cggpdza7avfz7eto6khheqtd6e