Generating textual descriptions of the differences between images is a relatively new task that requires combining computer vision and natural language techniques. In this paper, we present a novel Fully Convolutional CaptionNet (FCC) that employs an encoder-decoder framework to extract visual features, compute the distances between those features, and generate new sentences describing the measured distances. After extracting the features of the images, a contrastive function is
doi:10.1109/access.2019.2957513 fatcat:4t7nsc62tze5hjq56rmvhruzsm