Can Visual Dialogue Models Do Scorekeeping? Exploring How Dialogue Representations Incrementally Encode Shared Knowledge

Brielen Madureira, David Schlangen
2022 Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Cognitively plausible visual dialogue models should keep a mental scoreboard of shared established facts in the dialogue context. We propose a theory-based evaluation method for investigating to what degree models pretrained on the VisDial dataset incrementally build representations that appropriately do scorekeeping. Our conclusion is that the ability to make the distinction between shared and privately known statements along the dialogue is moderately present in the analysed models, but not always incrementally consistent, which may partially be due to the limited need for grounding interactions in the original task.
Data Visual Dialogues and Encoders. We use the VisDial dataset v.1.0 (Das et al., 2017a) and the three Q and A encoders (RL_DIV, SL and ICCV_RL)
doi:10.18653/v1/2022.acl-short.73 fatcat:r7tr3fbrlbaxlmbweusdaactra