A Corpus and Model Integrating Multiword Expressions and Supersenses

Nathan Schneider, Noah A. Smith
This paper introduces a task of identifying and semantically classifying lexical expressions in running text. We investigate the online reviews genre, adding semantic supersense annotations to a 55,000-word English corpus that was previously annotated for multiword expressions. The noun and verb supersenses apply to full lexical expressions, whether single- or multiword. We then present a sequence tagging model that jointly infers lexical expressions and their supersenses. Results show that with our relatively small training corpus in a noisy domain, the joint task can be performed to attain 70% class labeling F1.
doi:10.1184/r1/6472955.v1 fatcat:c56bwtgpxjfsvcimpniktmw56m