Automatic Adaptation of WordNet to Sublanguages and to Computational Tasks

Roberto Basili, Alessandro Cucchiarelli, Carlo Consoli, Maria Teresa Pazienza, Paola Velardi
1998 International Conference on Computational Linguistics  
Semantically tagging a corpus is useful for many intermediate NLP tasks, such as acquisition of word argument structures in sublanguages, acquisition of syntactic disambiguation cues, and terminology learning. Semantic categories allow the generalization of observed word patterns and facilitate the discovery of recurrent sublanguage phenomena and selectional rules of various types. Yet, as opposed to POS tags in morphology, there is no consensus in the literature about the type and granularity of the category inventory. In addition, most available on-line taxonomies, such as WordNet, are overly ambiguous and, at the same time, may not include many domain-dependent senses of words. In this paper we describe a method to adapt a general-purpose taxonomy to an application sublanguage: first, we prune branches of the WordNet hierarchy that are too "fine grained" for the domain; then, a statistical model of classes is built from corpus contexts to sort the different classifications of known words or to assign a classification to unknown words.
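The two steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy hierarchy, the set of retained categories, the helper names (`generalize`, `train`, `rank`), and the add-one-smoothed log-likelihood score, which is a simple stand-in and not the statistical model used in the paper.

```python
import math
from collections import Counter, defaultdict

# Toy hypernym links (child -> parent); hypothetical, not real WordNet data.
PARENT = {
    "stock_certificate": "document",
    "document": "communication",
    "broth": "food",
    "food": "substance",
    "communication": "abstraction",
    "substance": "entity",
    "abstraction": "entity",
}

# Coarse categories kept after pruning (an assumed cut for this toy domain).
KEPT = {"communication", "substance"}


def generalize(node):
    """Climb the hierarchy until a kept category is reached (None if there is none)."""
    while node is not None and node not in KEPT:
        node = PARENT.get(node)
    return node


def train(tagged_contexts):
    """Count context words per category from (category, context_word) pairs."""
    counts = defaultdict(Counter)
    for category, context in tagged_contexts:
        counts[category][context] += 1
    return counts


def rank(contexts, candidates, counts):
    """Sort candidate categories by add-one-smoothed log-likelihood of the contexts."""
    vocab = {w for c in counts.values() for w in c}

    def score(cat):
        total = sum(counts[cat].values())
        return sum(
            math.log((counts[cat][w] + 1) / (total + len(vocab) + 1))
            for w in contexts
        )

    return sorted(candidates, key=score, reverse=True)


# Step 1: collapse the fine-grained senses of an ambiguous word to kept categories.
senses_of_stock = ["stock_certificate", "broth"]
candidates = {generalize(s) for s in senses_of_stock}

# Step 2: rank those categories using contexts observed in a (toy) domain corpus.
counts = train([
    ("communication", "buy"), ("communication", "sell"), ("communication", "issue"),
    ("substance", "boil"), ("substance", "eat"),
])
ranking = rank(["buy", "sell"], candidates, counts)
```

In this toy financial setting the contexts "buy" and "sell" favour the "communication" category over "substance". The same `rank` call could score every kept category for a word absent from the taxonomy, which mirrors the known-word and unknown-word cases in the abstract.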
dblp:conf/coling/BasiliCCPV98 fatcat:mgoj45betnhfrdqqt6kbsxvumy