WEmbSim: A Simple yet Effective Metric for Image Captioning

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah
2020 2020 Digital Image Computing: Techniques and Applications (DICTA)  
Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.  ...  Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions can  ...  Our contributions in this work are as follows: • We propose a simple yet effective unsupervised metric WEmbSim for automatic caption evaluation. • To demonstrate the strong performance of WEmbSim compared  ... 
doi:10.1109/dicta51227.2020.9363392 fatcat:bdbpf3xwhbcq3df7yvjdkmfp2e
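
The snippet above describes the metric as a cosine similarity over the Mean of Word Embeddings (MOWE) of captions. A minimal sketch of that idea follows. It is not the authors' implementation: the toy three-dimensional embedding table, the whitespace tokenizer, the choice to skip unknown words, and the names `TOY_EMBEDDINGS`, `mean_embedding`, `cosine`, and `mowe_similarity` are all assumptions made for illustration, and a real setup would presumably load pretrained word vectors instead.

```python
import math

# Toy 3-dimensional embeddings, invented purely for illustration.
# (Assumption: a real system would load pretrained word vectors.)
TOY_EMBEDDINGS = {
    "a": [0.1, 0.0, 0.1],
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "runs": [0.0, 0.9, 0.1],
    "sleeps": [0.0, 0.1, 0.9],
}


def mean_embedding(caption, table):
    """Average the vectors of the caption's known words (MOWE).

    Tokenization is a plain lowercase whitespace split, and words missing
    from the table are skipped; both are simplifying assumptions.
    Returns None when no word of the caption is in the table.
    """
    tokens = [t for t in caption.lower().split() if t in table]
    if not tokens:
        return None
    dim = len(table[tokens[0]])
    return [sum(table[t][i] for t in tokens) / len(tokens) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mowe_similarity(candidate, reference, table=TOY_EMBEDDINGS):
    """Score one candidate caption against one reference caption."""
    cand_vec = mean_embedding(candidate, table)
    ref_vec = mean_embedding(reference, table)
    if cand_vec is None or ref_vec is None:
        return 0.0  # assumption: captions with no known words score zero
    return cosine(cand_vec, ref_vec)


if __name__ == "__main__":
    print(mowe_similarity("a dog runs", "a puppy runs"))
    print(mowe_similarity("a dog runs", "a dog sleeps"))
```

With these toy vectors, the near-paraphrase "a puppy runs" scores higher against "a dog runs" than "a dog sleeps" does, which is the behaviour such a metric is meant to capture. How multiple reference captions are combined is not stated in the snippet, so the sketch scores a single candidate-reference pair only.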

WEmbSim: A Simple yet Effective Metric for Image Captioning [article]

Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah
2020 arXiv   pre-print
Therefore, we believe that WEmbSim sets a new baseline for any complex metric to be justified.  ...  Based on our past attempts at developing highly sophisticated learning-based metrics, we have discovered that a simple cosine similarity measure using the Mean of Word Embeddings (MOWE) of captions can  ...  Our contributions in this work are as follows: • We propose a simple yet effective unsupervised metric WEmbSim for automatic caption evaluation. • To demonstrate the strong performance of WEmbSim compared  ... 
arXiv:2012.13137v1 fatcat:gw25ymwapnhm5p5dhrl5x76roi

Goal-driven text descriptions for images [article]

Ruotian Luo
2021 arXiv   pre-print
In Chapter 3, we focus on generating the referring expression, a text description for an object in the image so that a receiver can infer which object is being described.  ...  In Chapter 4, we introduce a method that encourages discriminability in image caption generation. We show that more discriminative captioning models generate more descriptive captions.  ...  With the same goal, [226] proposed a simple yet effective method: reweighing the importance of ground truth captions to encourage the discriminability of generated captions.  ... 
arXiv:2108.12575v1 fatcat:xxphnu354zc7tfl6odertd5kxm

Natural Language Description of Images [article]

Naeha Sharif
2020
MSCOCO is one of the most popular datasets amongst the image captioning research community; it contains 123,287 images, each paired with at least 5 captions.  ...  To maintain consistency with the literature, we use the publicly available split of MSCOCO, which provides 5,000 images each for testing and validation.  ...  WEmbSim: A Simple yet Effective Metric for Image Captioning The area of automatic image caption evaluation is still undergoing intensive research to address the needs of generating captions which can meet  ... 
doi:10.26182/nskk-bg13 fatcat:63ous52jfvcenapt4tfomewxw4