Annotating Coordination in the Penn Treebank

Wolfgang Maier, Sandra Kübler, Erhard W. Hinrichs, Julia Kriwanek
2012 Linguistic Annotation Workshop  
Finding coordinations provides useful information for many NLP endeavors. However, the task has not received much attention in the literature. A major reason is that the major treebanks do not annotate coordination reliably. This makes it virtually impossible to detect coordinations in which two conjuncts are separated by punctuation rather than by a coordinating conjunction. In this paper, we present an annotation scheme for the Penn Treebank which introduces a distinction between coordinating and non-coordinating punctuation. We discuss the general annotation guidelines as well as problematic cases. Finally, we show that this additional annotation allows the retrieval of a considerable number of coordinate structures beyond the ones having a coordinating conjunction.
dblp:conf/acllaw/MaierKHK12 fatcat:ptx6kxryjfebfe2b5sbulngb2i