A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Gorman and Bedrick (2019) argued for using random splits rather than standard splits in NLP experiments. ... In NLP, however, even worst-case splits, maximizing bias, often under-estimate the error observed on new samples of in-domain data, i.e., the data that models should minimally generalize to at test time. ... The paper also benefited greatly from discussions with several of our colleagues at Google Research, including Slav Petrov and Sascha Rothe. ...

doi:10.18653/v1/2021.eacl-main.156