Automatic extension of non-English wordnets

Katja Hofmann, Erik Tjong Kim Sang
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
In this paper we apply the method of [4] to a non-English language (Dutch) for which only a basic WordNet and a parser are available.  ...  Problems in transferring the method to another language include the small size of non-English WordNets and the lack of sense-tagged corpora in other languages.  ... 
doi:10.1145/1277741.1277932 dblp:conf/sigir/HofmannS07 fatcat:4cwvwqoxoff4jbpfs3xqpnpdnu

Polylingual Wordnet [article]

Mihael Arcan and John McCrae and Paul Buitelaar
2019 arXiv   pre-print
Princeton WordNet is one of the most important resources for natural language processing, but is only available for English.  ...  Therefore it would be beneficial to have a high-quality automatic translation approach that would support NLP techniques, which rely on WordNet in new languages.  ...  We evaluate the quality of translations of the WordNet entries against the existing entries in the non-English Wordnets (Table 3).  ... 
arXiv:1903.01411v1 fatcat:k7rilkthk5gs3gpokv7avy4ay4

Improving Wordnets for Under-Resourced Languages Using Machine Translation information

Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae
2018 Zenodo  
Wordnets are extensively used in natural language processing, but the current approaches for manually building a wordnet from scratch involve large research groups for a long period of time, which are  ...  We report evaluation results of the generated wordnet senses in terms of precision for these languages.  ...  In general, creating NLP systems requires an extensive amount of resources and manual effort; however, under-resourced languages lack both.  ... 
doi:10.5281/zenodo.2599952 fatcat:a6nal5jxsvgjhcfpg4p52r622i

Enlarging the Croatian WordNet with WN-Toolkit and Cro-Deriv

Antoni Oliver, Kresimir Sojat, Matea Srebacic
2015 Recent Advances in Natural Language Processing  
Comparing these figures with the size of the Princeton WordNet for English version 3.0, which has 117,659 synsets and 206,975 synset-variant pairs, it is clear that the CroWN should be expanded.  ...  After this first expansion, CroWN reached 70.63% of the core wordnet.  ...  The automatic translation of sense-tagged corpora has been performed thanks to an academic agreement with Google.  ... 
dblp:conf/ranlp/OliverSS15 fatcat:6upzey6gife3bkyhmwsvbnyua4

Further expansion of the Croatian WordNet

Kresimir Sojat, Matea Filko, Antoni Oliver
2018 Global WordNet Conference  
In this paper a semi-automatic procedure for the expansion of the Croatian Wordnet (CroWN) is presented.  ...  The precision of the automatic process is low (about 30%), but the results proved valuable for the enlargement of CroWN.  ...  Acknowledgments This research has been carried out thanks to the project TUNER TIN2015-65308-C5-1-R (MINECO/FEDER, UE) and the short-term research support of the University of Zagreb.  ... 
dblp:conf/wordnet/SojatFO18 fatcat:56spmkbl7bhvjgdnpkezxyfy7u

Constructing a poor man's wordnet in a resource-rich world

Darja Fišer, Benoît Sagot
2015 Language Resources and Evaluation  
Automatic, manual and task-based evaluations show that the resulting resource, the latest version of the Slovene wordnet, is already a valuable source of lexico-semantic information.  ...  In this paper we present a language-independent, fully modular and automatic approach to bootstrap a wordnet for a new language by recycling different types of already existing language resources, such  ...  Acknowledgments The work described in this paper was funded in part by the French-Slovene PHC PROTEUS project 22718UC "Building Slovene-French linguistic resources: parallel corpus and wordnet" (2010  ... 
doi:10.1007/s10579-015-9295-6 fatcat:som24c2hn5bhzpgylpdbs2rrrq

The data-driven Bulgarian WordNet: BTBWN

Petya Osenova, Kiril Simov
2018 Cognitive Studies | Études cognitives  
The mapping between the two WordNets (English and Bulgarian) is a basis for applications such as machine translation and multilingual information retrieval.  ...  The paper presents our work towards the simultaneous creation of a data-driven WordNet for Bulgarian and a manually annotated treebank with semantic information.  ...  Acknowledgment This research has received partial support by the grant 02/12 Deep Models of Semantic Knowledge (DemoSem), funded by the Bulgarian National Science Fund in 2017-2019.  ... 
doi:10.11649/cs.1713 fatcat:nb5fj4ifkrgmboy6lmuxdecaye

How Stable are WordNet Synsets?

Eric Kafe
2017 International Conference on Language, Data, and Knowledge  
Many of them are very large and provide extensive coverage of vocabulary. All of them are somehow mapped to different versions of Princeton WordNet.  ...  BulNet (Bulgarian), Czech Wordnet, DanNet, Dutch Open Source WordNet, enWordNet (English, a significant extension to Princeton WordNet), Estonian WordNet, Finnish WordNet, GermaNet, MultiWordNet, plWordNet  ...  Some problems identified by NTU-MC • Some solutions with the especially MWEs and cultural words non-English senses: gohan "cooked rice" kome "rice grains" want to add multiple languages at once small  ... 
dblp:conf/ldk/Kafe17 fatcat:bcjvjlpjnrcabc67jvxqgaopne

Expanding wordnets to new languages with multilingual sense disambiguation

Mihael Arcan, John Philip McCrae, Paul Buitelaar
2016 International Conference on Computational Linguistics  
Princeton WordNet is one of the most important resources for natural language processing, but is only available for English.  ...  Therefore it would be beneficial to have a high-quality automatic translation approach that would support NLP techniques, which rely on WordNet in new languages.  ...  automatic translations of English WordNet entries.  ... 
dblp:conf/coling/ArcanMB16 fatcat:jf22rfjymrakjkrpe5skxs4uce

Creating the Open Wordnet Bahasa

Nurril Hirfana Bte Mohamed Noor, Suerya Sapuan, Francis Bond
2011 Pacific Asia Conference on Language, Information and Computation  
It is created by combining information from several lexical resources: the French-English-Malay dictionary FEM, the KAmus Melayu-Inggeris KAMI, and wordnets for English, French and Chinese.  ...  Construction went through three steps: (i) automatic building of word candidates; (ii) evaluation and selection of acceptable candidates from merging of lexicons; (iii) final hand check of the 5,000 core  ...  CoreNet is an extension of Goi-Taikei to Chinese and Korean. These consist of a table matching CoreNet classes to one or more wordnet synsets.  ... 
dblp:conf/paclic/NoorSB11 fatcat:4h6pkhrzt5eu3ouaj2ddhg5lsq

A Multilingual Lexico-Semantic Database and Ontology [chapter]

Francis Bond, Christiane Fellbaum, Shu-Kai Hsieh, Chu-Ren Huang, Adam Pease, Piek Vossen
2014 Towards the Multilingual Semantic Web  
It is made by exploiting links from various monolingual wordnets to the English Wordnet. Currently, it contains 118,337 concepts expressed in 1,643,260 senses in 22 languages.  ...  First we describe the Open Multilingual Wordnet, a multilingual wordnet with twenty two languages and a rich structure of semantic relations.  ...  The more expressive the representation, and the more extensive the set of formalizations for each concept, the more things that can be checked automatically.  ... 
doi:10.1007/978-3-662-43585-4_15 fatcat:5c33b2vkfbgtrdky6y3fq4ljbi

Studying Taxonomy Enrichment on Diachronic WordNet Versions [article]

Irina Nikishina, Alexander Panchenko, Varvara Logacheva, Natalia Loukachevitch
2020 arXiv   pre-print
We explore the possibilities of taxonomy extension in a resource-poor setting and present methods which are applicable to a large number of languages.  ...  We create novel English and Russian datasets for training and evaluating taxonomy enrichment models and describe a technique of creating such datasets for other languages.  ...  We thank Yuriy Nazarov and David Dale for running their approaches from RUSSE'2020 shared task on the English datasets.  ... 
arXiv:2011.11536v1 fatcat:wz5vq2e7u5bndjk42b5qj7umlu

Dictionary Alignment for Context-sensitive Word Glossing

Willy Yap, Timothy Baldwin
2007 Australasian Language Technology Association Workshop  
The basis of the proposed method is sentence similarity of the sense definition sentences, using a bilingual Japanese-to-English dictionary as a pivot during the alignment process.  ...  This paper proposes a method for automatically sense-to-sense aligning dictionaries in different languages (focusing on Japanese and English), based on structural data in the respective dictionaries.  ...  Asanoma (2001) aligned the Japanese Goi-Taikei ontology with WordNet by first translating a significant subset of the WordNet synonym sets (synsets) into Japanese, automatically matching these based on  ... 
dblp:conf/acl-alta/YapB07 fatcat:ffrhglayazcb5hnhpfaeayudci

XL-WSD: An Extra-Large and Cross-Lingual Evaluation Framework for Word Sense Disambiguation

Tommaso Pasini, Alessandro Raganato, Roberto Navigli
2021 Zenodo  
The fast development of new approaches has been further encouraged by a well-framed evaluation suite for English, which has allowed their performances to be kept track of and compared fairly.  ...  We leverage XL-WSD datasets to conduct an extensive evaluation of neural and knowledge-based approaches, including the most recent multilingual language models.  ...  Acknowledgments The authors gratefully acknowledge the support of the ERC Consolidator Grants MOUSSE No. 726487, and FoTran No. 771113 and the ELEXIS project No. 731015 under the European Union's Horizon  ... 
doi:10.5281/zenodo.5543386 fatcat:nuc7warvmfasnjqbcnrixclbxm

Challenges Behind the Data-driven Bulgarian WordNet (BulTreeBank Bulgarian Wordnet)

Petya Osenova, Kiril Ivanov Simov
2017 International Conference on Language, Data, and Knowledge  
The mapping between the two WordNets (English and Bulgarian) is a basis for applications such as machine translation and multilingual information retrieval.  ...  The paper presents our work towards the simultaneous creation of a data-driven WordNet for Bulgarian and a manually annotated treebank with semantic information.  ...  Acknowledgements This research has received partial support by the grant 02/12 -Deep Models of Semantic Knowledge (DemoSem), funded by the Bulgarian National Science Fund in 2017-2019.  ... 
dblp:conf/ldk/OsenovaS17 fatcat:algvqt5eencdxp7gdxt7hpgcgy