Learning multilingual named entity recognition from Wikipedia

Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R. Curran
2013 Artificial Intelligence  
We automatically create enormous, free and multilingual silver-standard training annotations for named entity recognition (ner) by exploiting the text and structure of Wikipedia.  ...  We first classify each Wikipedia article into named entity (ne) types, training and evaluating on 7200 manually-labelled Wikipedia articles across nine languages.  ...  Introduction Named entity recognition (ner) is the information extraction task of identifying and classifying mentions of people, organisations, locations and other named entities (nes) within text.  ... 
doi:10.1016/j.artint.2012.03.006 fatcat:7agjkau5wfhqbeyit3sddv2ggy

JRC-Names: A freely available, highly multilingual named entity resource [article]

Ralf Steinberger, Bruno Pouliquen, Mijail Kabadjov, Erik van der Goot
2013 arXiv   pre-print
These include improving name search in databases or on the internet, seeding machine learning systems to learn named entity recognition rules, improving machine translation results, and more.  ...  This paper describes a new, freely available, highly multilingual named entity resource for person and organisation names that has been compiled over seven years of large-scale multilingual news analysis  ...  Wentland et al. (2008) built a multilingual named entity dictionary by mining Wikipedia and exploiting various link types.  ... 
arXiv:1309.6162v1 fatcat:pcgozhfl4ndfzli3dymbmz7lkm

Multi-Multi-View Learning: Multilingual and Multi-Representation Entity Typing

Yadollah Yaghoobzadeh, Hinrich Schütze
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
For language, we consider high-resource and low-resource languages from Wikipedia.  ...  For representation, we consider representations based on the context distribution of the entity (i.e., on its embedding), on the entity's name (i.e., on its surface form) and on its description in Wikipedia  ...  We use these multilingual names and Wikipedias to build our representation views as described in §3.  ... 
doi:10.18653/v1/d18-1343 dblp:conf/emnlp/YaghoobzadehS18 fatcat:iz6le43puvb47feoswngprcjby

Multi-Multi-View Learning: Multilingual and Multi-Representation Entity Typing [article]

Yadollah Yaghoobzadeh, Hinrich Schütze
2018 arXiv   pre-print
For language, we consider high-resource and low-resource languages from Wikipedia.  ...  For representation, we consider representations based on the context distribution of the entity (i.e., on its embedding), on the entity's name (i.e., on its surface form) and on its description in Wikipedia  ...  We use these multilingual names and Wikipedias to build our representation views as described in §3.  ... 
arXiv:1810.10499v1 fatcat:u4koeg6km5fabms7yfx2qalj54

mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models [article]

Ryokan Ri, Ikuya Yamada, Yoshimasa Tsuruoka
2022 arXiv   pre-print
Recent studies have shown that multilingual pretrained language models can be effectively improved with cross-lingual alignment information from Wikipedia entities.  ...  We train a multilingual language model with 24 languages with entity representations and show the model consistently outperforms word-based pretrained models in various cross-lingual transfer tasks.  ...  Named Entity Recognition Named Entity Recognition (NER) is the task to detect entities in a sentence and classify their type.  ... 
arXiv:2110.08151v3 fatcat:h3onzzepznhlnk6xwmqzoosjy4

Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping

Jian Ni, Radu Florian
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
The state-of-the-art named entity recognition (NER) systems are statistical machine learning models that have strong generalization capability (i.e., can recognize unseen entities that do not appear in  ...  Central to our approach is the construction of high-accuracy, high-coverage multilingual Wikipedia entity type mappings.  ...  Acknowledgments We would like to thank Avirup Sil for helpful comments, and for collecting the Wikipedia data. We also thank the anonymous reviewers for their suggestions.  ... 
doi:10.18653/v1/d16-1135 dblp:conf/emnlp/NiF16 fatcat:z2fyglnnsndhdd7brduzc5ozq4

DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition [article]

Xinyu Wang, Yongliang Shen, Jiong Cai, Tao Wang, Xiaobin Wang, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang, Kewei Tu, Wei Lu, Yong Jiang
2022 arXiv   pre-print
The lack of contexts makes the recognition of ambiguous named entities challenging.  ...  To alleviate this issue, our team DAMO-NLP proposes a knowledge-based system, where we build a multilingual knowledge base based on Wikipedia to provide related context information to the named entity  ...  Named Entity Recognition Module In our system, we use XLM-R large as the embedding for all the tracks. It is a multilingual model and is applicable to all tracks.  ... 
arXiv:2203.00545v3 fatcat:gnrefrnys5cmlhld73kanfm5rm

LVBERT: Transformer-Based Model for Latvian Language Understanding

Arturs Znotins, Guntis Barzdins
2020 Human Language Technology - The Baltic Perspectiv  
We show that LVBERT improves the state-of-the-art for three Latvian NLP tasks including Part-of-Speech tagging, Named Entity Recognition and Universal Dependency parsing.  ...  Named Entity Recognition For training and evaluating NER, we used a recently published multi-layer text corpus for Latvian [14].  ...  Table 2. Named entity dataset statistics. Table 3. Performance of LVBERT on Latvian NLP tasks compared to multilingual BERT  ... 
doi:10.3233/faia200610 dblp:conf/hlt/ZnotinsB20 fatcat:zx2d3a72ifb6nbt5zdmwqxzvim

POLYGLOT-NER: Massive Multilingual Named Entity Recognition [article]

Rami Al-Rfou, Vivek Kulkarni, Bryan Perozzi, Steven Skiena
2014 arXiv   pre-print
We describe a system that builds Named Entity Recognition (NER) annotators for 40 major languages using Wikipedia and Freebase.  ...  Then, we automatically generate datasets from Wikipedia link structure and Freebase attributes.  ...  In this work, we perform a case study on how to build a massively multilingual Named Entity Recognition (NER) system.  ... 
arXiv:1410.3791v1 fatcat:kqkxgidkgzf2lp4iuh4twuicnm

Named Entity Disambiguation for German News Articles

Andreas Lommatzsch, Danuta Ploch, Ernesto William De Luca, Sahin Albayrak
2010 Lernen, Wissen, Daten, Analysen  
Unfortunately, each of these resources on its own covers insufficient information for the task of named entity disambiguation.  ...  We show that the intelligent filtering of context data and the combination of multilingual information provide high quality named entity disambiguation results.  ...  Due to the fact that we perform the named entity recognition based on DBpedia we focus on Wikipedia as data source.  ... 
dblp:conf/lwa/LommatzschPLA10 fatcat:5er76fburva5jbhjv5smdwyasm

The State of the Art on Knowledge Graph Construction from Text: Named Entity Recognition and Relation Extraction Perspectives

Jennifer D'Souza, Nandana Mihindukulasooriya
2022 Zenodo  
"Neural Architectures for Named Entity Recognition."  ...  in focus on the community toward language-independent named entity recognition  ...  Corpora; Wikipedia-based NER  ...  supervision / Distant supervision approaches; Noise Reduction Approaches; Auxiliary Information; Embedding based approaches; Lexical analysis / phrase patterns; Syntactical analysis and dependency Trees; Joint Entity  ... 
doi:10.5281/zenodo.6521883 fatcat:swra4e7rf5dljhjojl7nc2cq4i

Large Language Models for Latvian Named Entity Recognition

Rinalds Viksna, Inguna Skadina
2020 Human Language Technology - The Baltic Perspectiv  
Transformer-based language models pre-trained on large corpora have demonstrated good results on multiple natural language processing tasks for widely used languages including named entity recognition  ...  We demonstrate that the Latvian BERT model, pre-trained on large Latvian corpora, achieves better results (81.91 F1-measure on average vs 78.37 on M-BERT for a dataset with nine named entity types, and  ...  Acknowledgements This research has been supported by the European Regional Development Fund within the joint project of SIA TILDE and University of Latvia "Multilingual Artificial Intelligence Based Human  ... 
doi:10.3233/faia200603 dblp:conf/hlt/ViksnaS20 fatcat:ppejs74ihnb3njupofnzsz7vq4

The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification [article]

Abdullatif Köksal, Arzucan Özgür
2020 arXiv   pre-print
We also provide the RELX-Distant dataset, which includes hundreds of thousands of sentences with relations from Wikipedia and Wikidata collected by distant supervision for these languages.  ...  To overcome this issue, we propose two cross-lingual relation classification models: a baseline model based on Multilingual BERT and a new multilingual pretraining setup, which significantly improves the  ...  question answering (Abdou et al., 2019) and named entity recognition (Nothman et al., 2013) .  ... 
arXiv:2010.09381v1 fatcat:3tbw5kmvhvbpbaaweq6sdcyjv4

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

Xiaolei Huang, Jonathan May, Nanyun Peng
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
Building named entity recognition (NER) models for languages that do not have much training data is a challenging task.  ...  We further analyze how transfer learning works for cross-lingual NER on two transferable factors: sequential order and multilingual embeddings, and investigate how model performance varies across entity  ...  Introduction Named Entity Recognition (NER) is an important NLP task that identifies the boundary and type of named entities (e.g., person, organization, location) in texts.  ... 
doi:10.18653/v1/d19-1672 dblp:conf/emnlp/HuangMP19 fatcat:bh52tcf7srepzebk5yxteavnc4

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis [article]

Xiaolei Huang, Jonathan May, Nanyun Peng
2019 arXiv   pre-print
Building named entity recognition (NER) models for languages that do not have much training data is a challenging task.  ...  We further analyze how transfer learning works for cross-lingual NER on two transferable factors: sequential order and multilingual embeddings, and investigate how model performance varies across entity  ...  Introduction Named Entity Recognition (NER) is an important NLP task that identifies the boundary and type of named entities (e.g., person, organization, location) in texts.  ... 
arXiv:1909.03598v1 fatcat:7itaxn5l2vdeloq2xiyjwt7voa