Partial-input baselines show that NLI models can ignore context, but they don't

Neha Srikanth, Rachel Rudinger
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
When strong partial-input baselines reveal artifacts in crowdsourced NLI datasets, the performance of full-input models trained on such datasets is often dismissed as reliance on spurious correlations. We investigate whether state-of-the-art NLI models are capable of overriding default inferences made by a partial-input baseline. We introduce an evaluation set of 600 examples consisting of perturbed premises to examine a RoBERTa model's sensitivity to edited contexts. Our results indicate that NLI models are still capable of learning to condition on context, a necessary component of inferential reasoning, despite being trained on artifact-ridden datasets.
doi:10.18653/v1/2022.naacl-main.350