Named entity recognition with document-specific KB tag gazetteers

Will Radford, Xavier Carreras, James Henderson
2015 Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing  
We consider a novel setting for Named Entity Recognition (NER) where we have access to document-specific knowledge base tags.  ...  Moreover, augmenting document-specific gazetteers with KB information lets users specify fewer tags for the same performance, reducing cost.  ...  Document-level KB tags We incorporate information from KB tags by building document-specific gazetteers.  ... 
doi:10.18653/v1/d15-1058 dblp:conf/emnlp/RadfordCH15 fatcat:2p2f6kkr7zhfzggk545tfph34i

Challenges in Information Retrieval from Unstructured Arabic Data

Hussein Khalil, Taha Osman
2014 2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation  
Moreover, to improve the intelligent exploration of unstructured documents in the Arabic domain.  ...  This paper investigates the Semantic Web (SW) support for handling documents that are authored and/or annotated in Arabic, and how to bridge the gap between the SW and Natural Language Processing (NLP)  ...  Two approaches were used in addressing the document; the first used a gazetteer to perform named entity recognition.  ... 
doi:10.1109/uksim.2014.115 dblp:conf/uksim/KhalilO14 fatcat:qkys7hsn6zcehj3ubxdrzhpwfm

KnowNER: Incremental Multilingual Knowledge in Named Entity Recognition [article]

Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, Gerhard Weikum
2017 arXiv   pre-print
KnowNER is a multilingual Named Entity Recognition (NER) system that leverages different degrees of external knowledge.  ...  Each category consists of a set of features automatically generated from different information sources (such as a knowledge-base, a list of names or document-specific semantic annotations) and is used  ...  [18], after the first run of NED, we create a set of document-specific gazetteers derived from the named entities found.  ... 
arXiv:1709.03544v1 fatcat:nfyisnvgx5gqtelhe4vx4ixvze

A Study of the Importance of External Knowledge in the Named Entity Recognition Task

Dominic Seyler, Tatiana Dembelova, Luciano Del Corro, Johannes Hoffart, Gerhard Weikum
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
In this work, we discuss the importance of external knowledge for performing Named Entity Recognition (NER).  ...  Each category consists of a set of features automatically generated from different information sources, such as a knowledgebase, a list of names, or document-specific semantic annotations.  ...  ., 2015), after the first run of NED, we create a set of document-specific gazetteers derived from the disambiguated entities.  ... 
doi:10.18653/v1/p18-2039 dblp:conf/acl/SeylerDCHW18 fatcat:2ujifnhbm5ah3do5aaml6ps5c4

Soft Gazetteers for Low-Resource Named Entity Recognition [article]

Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell
2020 arXiv   pre-print
Traditional named entity recognition models use gazetteers (lists of entities) as features to improve performance.  ...  To address this problem, we propose a method of "soft gazetteers" that incorporates ubiquitously available information from English knowledge bases, such as Wikipedia, into neural named entity recognition  ...  We also thank Samridhi Choudhary for help with the model implementation and Deepak Gopinath for feedback on the paper.  ... 
arXiv:2005.01866v1 fatcat:m43ofz2bhndgrbbl32rsy7vshm

Natural Language Processing for Information Extraction [article]

Sonit Singh
2018 arXiv   pre-print
Various sub-tasks of IE such as Named Entity Recognition, Coreference Resolution, Named Entity Linking, Relation Extraction, Knowledge Base reasoning forms the building blocks of various high end Natural  ...  With rise of digital age, there is an explosion of information in the form of news, articles, social media, and so on.  ...  in Named Entity Recognition (NER) Named Entity Recognition can be seen as a word-level tagging problem where each word in a sentence is mapped to a named entity tag.  ... 
arXiv:1807.02383v1 fatcat:3bdyidbjp5hn7c2w4iqve4ajvi

Automatically Annotated Turkish Corpus for Named Entity Recognition and Text Categorization using Large-Scale Gazetteers [article]

H. Bahadir Sahin, Caglar Tirkaz, Eray Yildiz, Mustafa Tolga Eren, Ozan Sonmez
2017 arXiv   pre-print
The constructed gazetteers contains approximately 300K entities with thousands of fine-grained entity types under 77 different domains.  ...  We make these datasets publicly available to support studies on Turkish named-entity recognition (NER) and text categorization (TC).  ...  Introduction Named-entity recognition (NER) is an information extraction (IE) task that aims to detect and categorize entities to pre-defined types in a text.  ... 
arXiv:1702.02363v2 fatcat:i7zt6gcncjhhxkaqiiyaarg2mi

Semi-Supervised Bootstrapping Approach for Named Entity Recognition

Thenmalar S, Balaji J, Geetha T.V
2015 International Journal on Natural Language Computing  
The aim of Named Entity Recognition (NER) is to identify references of named entities in unstructured documents, and to classify them into pre-defined semantic categories.  ...  We have evaluated the proposed system for English language with the dataset of tagged (IEER) and untagged (CoNLL 2003) named entity corpus and for Tamil language with the documents from the FIRE corpus  ...  The lists of entities stated in the document are not usually available in general named entity recognition.  ... 
doi:10.5121/ijnlc.2015.4501 fatcat:c2k23azrhzgkxjtbhev47ej4vm

Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text

Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala
2020 Remote Sensing  
We include other assessment measures to assess the recognition ability of place names and the prediction of what we called geographic levels (administrative jurisdiction of places).  ...  Computer systems capable of discovering geographic information from natural language involve a complex process called geoparsing, which includes two important tasks: geographic entity recognition and toponym  ...  Geographic-Named Entity Recognition We have obtained the semantic features based on word embeddings obtained with word2vec [29] .  ... 
doi:10.3390/rs12183041 doaj:2a94e8c05d16492f856aa3ed81fb4916 fatcat:odfrdlic2fa37ahqtpx624c7wa

Semi-supervised Bootstrapping approach for Named Entity Recognition [article]

S. Thenmalar, J. Balaji, T.V. Geetha
2015 arXiv   pre-print
The aim of Named Entity Recognition (NER) is to identify references of named entities in unstructured documents, and to classify them into pre-defined semantic categories.  ...  We have evaluated the proposed system for English language with the dataset of tagged (IEER) and untagged (CoNLL 2003) named entity corpus and for Tamil language with the documents from the FIRE corpus  ...  The lists of entities stated in the document are not usually available in general named entity recognition.  ... 
arXiv:1511.06833v1 fatcat:cpllg3sbrzffxady3gzewf3lwm

Theophrastus: On demand and real-time automatic annotation and exploration of (web) documents using open linked data

Pavlos Fafalios, Panagiotis Papadakos
2014 Journal of Web Semantics  
Theophrastus is a system that supports the automatic annotation of (web) documents through entity mining and provides exploration services by exploiting Linked Open Data (LOD), in real-time and only when  ...  Theophrastus has been designed to be highly configurable regarding a number of different aspects like entities of interest, information cards and external search systems.  ...  Theophrastus can use any named entity recognition service that takes as input a text and returns a list of entity names.  ... 
doi:10.1016/j.websem.2014.07.009 fatcat:c3jhebkycbfvvffqpqaq4tyake

Theophrastus: On Demand and Real-Time Automatic Annotation and Exploration of (Web) Documents Using Open Linked Data

Pavlos Fafalios, Panagiotis Papadakos
2014 Social Science Research Network  
Theophrastus is a system that supports the automatic annotation of (web) documents through entity mining and provides exploration services by exploiting Linked Open Data (LOD), in real-time and only when  ...  Theophrastus has been designed to be highly configurable regarding a number of different aspects like entities of interest, information cards and external search systems.  ...  Theophrastus can use any named entity recognition service that takes as input a text and returns a list of entity names.  ... 
doi:10.2139/ssrn.3199160 fatcat:vfnftxdp2rabfldijf37wey3jy

Name Tagging for Low-resource Incident Languages based on Expectation-driven Learning

Boliang Zhang, Xiaoman Pan, Tianlu Wang, Ashish Vaswani, Heng Ji, Kevin Knight, Daniel Marcu
2016 Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
In this paper we tackle a challenging name tagging problem in an emergent setting: the tagger needs to be complete within a few hours for a new incident language (IL) using very few resources.  ...  linguistic knowledge from native speakers, mining and projecting patterns from both mono-lingual and cross-lingual corpora, and typing based on cross-lingual entity linking.  ...  The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government.  ... 
doi:10.18653/v1/n16-1029 dblp:conf/naacl/ZhangPWVJKM16 fatcat:owju6kda2vhupntwrrkrzn24oe

Saarland University Spoken Language Systems at the Slot Filling Task of TAC KBP 2010

Grzegorz Chrupala, Saeedeh Momtazi, Michael Wiegand, Stefan Kazalski, Fang Xu, Benjamin Roth, Alexandra Balahur, Dietrich Klakow
2010 Text Analysis Conference  
Named Entity Recognition Both the sentence retrieval and the relation extraction components of our system need access to named entity (NE) labels specific to the slot filling task.  ...  By including that entity in our query, we ensure to retrieve sentences dealing with the target. • The expected named entity: The named entity tag of the slot value that is to be found is called the expected  ... 
dblp:conf/tac/ChrupalaMWKXRBK10 fatcat:yfupep33vrbnlonog3wmhlhlqe

Semi-Automatic Semantic Annotations for Web Documents

Nadzeya Kiyavitskaya, Nicola Zeni, James R. Cordy, Luisa Mich, John Mylopoulos
2005 Semantic Web Applications and Perspectives  
Semantic annotation of the web documents is the only way to make the Semantic Web vision a reality.  ...  Considering the scale and dynamics of worldwide web, the largest knowledge base ever built, it becomes clear that we cannot afford to annotate web documents manually.  ...  Text is preprocessed using Annie, shallow information extraction system included in Gate package (text tokenization, sentence splitting, part of speech tagging, gazetteer lookup and named entity recognition  ... 
dblp:conf/swap/KiyavitskayaZCMM05 fatcat:7rmrppdamzhkvnprobdzlmndvu