Multilingual resources for entity extraction

Stephanie Strassel, Alexis Mitchell, Shudong Huang
2003 Proceedings of the ACL 2003 workshop on Multilingual and mixed-language named entity recognition
Progress in human language technology requires increasing amounts of data and annotation in a growing variety of languages. Research in Named Entity extraction is no exception. Linguistic Data Consortium is creating annotated corpora to support information extraction in English, Chinese, Arabic, and other languages for a variety of US Government-sponsored programs. This paper covers the scope of annotation and research tasks within these programs, describes some of the challenges of multilingual corpus development for entity extraction, and concludes with a description of the corpora developed to support this research.
doi:10.3115/1119384.1119391 dblp:conf/acl/StrasselM03 fatcat:6nbkqm755vgw7pkojgvj2ea2c4