VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers [article]

Estelle Aflalo, Meng Du, Shao-Yen Tseng, Yongfei Liu, Chenfei Wu, Nan Duan, Vasudev Lal
2022 arXiv pre-print
VL-InterpreT is a task-agnostic and integrated tool that (1) tracks a variety of statistics in attention heads throughout all layers for both vision and language components, (2) visualizes cross-modal ... To contribute to this quest, we propose VL-InterpreT, which provides novel interactive visualizations for interpreting the attentions and hidden representations in multimodal transformers. ... Conclusions and Future Directions: In this paper, we presented VL-InterpreT, an interactive visualization tool for interpreting vision-language transformers. ...
arXiv:2203.17247v2 fatcat:mcnvgrevcnf6plnyvikhjlx2ee