EUROWORDNET: A MULTILINGUAL DATABASE OF AUTONOMOUS AND LANGUAGE-SPECIFIC WORDNETS CONNECTED VIA AN INTER-LINGUAL-INDEX

P. Vossen
2004 International Journal of Lexicography  
This paper describes the multilingual design of the EuroWordNet database. The EuroWordNet database stores wordnets as autonomous language-specific structures that are interconnected via an Inter-Lingual-Index (ILI). In this paper, we discuss the possibilities to create mappings from each wordnet to the central ILI and how the ILI itself can be adapted to provide more overlap across the wordnets. We will argue that the ILI can be condensed to a more universal index of meaning, while the wordnets can still encode any fine-grained lexicalizations for each language.

Introduction

EuroWordNet (Vossen 1998) was a 3-year project that developed a multilingual database with wordnets for 8 European languages: English, Dutch, Italian, Spanish, French, German, Czech and Estonian. Each wordnet is structured along the same lines as the Princeton WordNet (Fellbaum 1998). WordNet contains information about nouns, verbs, adjectives and adverbs in English and is organized around the notion of a synset. A synset is a set of words with the same part-of-speech that can be interchanged in a certain context. For example, {violin; fiddle} form a synset because they can be used to refer to the same concept, but {violin; violist; fiddler} represent another concept. It is thus clear that the same word can refer to multiple concepts (polysemy) and multiple words can point to the same concept (synonymy). Finally, synsets are related to each other by semantic relations, such as hyponymy (type-of relation between specific and more general concepts), meronymy (part-of relation between parts and wholes), etc.

The wordnets in EuroWordNet are considered as autonomous language-specific ontologies. Each language has its own set of concepts based on the lexicalisation in that language. In addition, the wordnets are interconnected via a so-called Inter-Lingual-Index so that you can go from a synset in one language to the synsets in any of the other languages. The purpose of the Inter-Lingual-Index (ILI) is to provide an efficient mapping across the wordnet structures. Since each wordnet is a separate ontology, the ILI itself can be reduced to a condensed and universal index of meaning. We will argue that such an index is better to relate wordnets to each other and to
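
As a rough illustration of the design described above, the following Python sketch models synsets that belong to separate language-specific wordnets and are linked only through shared ILI records. It is a minimal sketch under stated assumptions, not the EuroWordNet database format: the class names, the identifier scheme ("en-1", "ili-instrument") and the Dutch entries are all hypothetical and chosen for illustration; only the two English synsets are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Synset:
    """A set of same-part-of-speech words that express one concept."""
    synset_id: str              # hypothetical identifier scheme
    language: str               # e.g. "en", "nl"
    pos: str                    # part of speech, e.g. "n"
    words: frozenset
    ili: Optional[str] = None   # hypothetical ILI record this synset maps to


class MultilingualDB:
    """Toy store: autonomous wordnets that meet only in the ILI."""

    def __init__(self):
        self.synsets = {}
        self.by_word = defaultdict(set)  # (language, word) -> synset ids
        self.by_ili = defaultdict(set)   # ILI record -> synset ids

    def add(self, synset):
        self.synsets[synset.synset_id] = synset
        for word in synset.words:
            self.by_word[(synset.language, word)].add(synset.synset_id)
        if synset.ili is not None:
            self.by_ili[synset.ili].add(synset.synset_id)

    def senses(self, language, word):
        """All synsets a word belongs to (more than one means polysemy)."""
        return sorted(self.by_word.get((language, word), set()))

    def translate(self, synset_id, target_language):
        """Go from a synset to the target-language synsets sharing its ILI record."""
        source = self.synsets[synset_id]
        if source.ili is None:
            return []
        return sorted(
            sid for sid in self.by_ili.get(source.ili, set())
            if self.synsets[sid].language == target_language
        )


db = MultilingualDB()
# English synsets from the text; the ILI record names are invented.
db.add(Synset("en-1", "en", "n", frozenset({"violin", "fiddle"}), "ili-instrument"))
db.add(Synset("en-2", "en", "n", frozenset({"violin", "violist", "fiddler"}), "ili-player"))
# Illustrative Dutch synsets, not taken from the actual Dutch wordnet.
db.add(Synset("nl-1", "nl", "n", frozenset({"viool"}), "ili-instrument"))
db.add(Synset("nl-2", "nl", "n", frozenset({"violist"}), "ili-player"))

print(db.senses("en", "violin"))    # the same word in two synsets
print(db.translate("en-1", "nl"))   # cross-language lookup via the ILI
```

In this sketch no synset points directly at a synset in another language. Every cross-language lookup passes through an ILI record, so each wordnet can keep its own internal structure, and merging or condensing ILI records would change the overlap between languages without touching the wordnets themselves.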
doi:10.1093/ijl/17.2.161 fatcat:vdqm43u4jzarpdqkw54ep4vsga