Harnessing Historical Corrections To Build Test Collections For Named Entity Disambiguation

Florian Reitz
2018 Zenodo  
Talk given at TPDL 2018, Porto, September 11, 2018
doi:10.5281/zenodo.1413459 fatcat:eh6uzzvebbe7vomaeqqowyzkxi

Harnessing Historical Corrections to Build Test Collections for Named Entity Disambiguation [chapter]

Florian Reitz
2018 Lecture Notes in Computer Science  
Matching mentions of persons to the actual persons (the name disambiguation problem) is central for several digital library applications.  ...  One problem is that test collections for this problem are often small and specific to a certain collection.  ...  The author thanks Oliver Hoffmann for providing the data on which the dblp test collection is built and Marcel R. Ackermann for helpful discussions and suggestions.  ... 
doi:10.1007/978-3-030-00066-0_4 fatcat:jcqoka6ajnhpvndkw5exeyltyi

Discovering and disambiguating named entities in text

Johannes Hoffart
2013 Proceedings of the 2013 Sigmod/PODS Ph.D. symposium on PhD symposium - SIGMOD'13 PhD Symposium  
A key challenge is the ambiguity of entity names, requiring robust methods to disambiguate names to canonical entities registered in a knowledge base.  ...  This dissertation develops methods to discover and disambiguate named entities, thus linking texts to knowledge bases.  ...  This task is similar in nature to named entity disambiguation: the goal is to disambiguate all words (often excluding named entities) in an input text to their correct meanings.  ... 
doi:10.1145/2483574.2483582 dblp:conf/sigmod/Hoffart13 fatcat:bjqq5uusrragdm7ewraf63of4e

Exploring entity recognition and disambiguation for cultural heritage collections

S. van Hooland, M. De Wilde, R. Verborgh, T. Steiner, R. Van de Walle
2013 Digital Scholarship in the Humanities  
This paper explores the possibilities and limitations of Named-Entity Recognition (NER) and Term Extraction (TE) to mine such unstructured metadata for meaningful concepts.  ...  By doing so, the paper offers a significant contribution towards understanding the value of entity recognition and disambiguation for the Digital Humanities.  ...  However, the test corpus consists of 3,724 historical Wikipedia articles, whose form and content may be inherently more suited for NER than descriptive metadata fields from a museum collection.  ... 
doi:10.1093/llc/fqt067 dblp:journals/lalc/HoolandWVSW15 fatcat:bwc7u7vlr5ecbhyoox23kw3nqq

(Ch. 4) From Index Locorum To Citation Network: An Approach To The Automatic Extraction Of Canonical References And Its Applications To The Study Of Classical Texts

Matteo Romanello
2017 Zenodo  
A common approach to disambiguating named entities is to use links to Wikipedia pages as unique identifiers.  ...  One possible way of mimicking to some extent how context works would be to build for each entity in the knowledge base a list of cooccurring words.  ... 
doi:10.5281/zenodo.793171 fatcat:lhl4nloedrgdhpmvvdwj42yiry

Recent advances in methods of lexical semantic relatedness – a survey

Ziqi Zhang, Anna Lisa Gentile, Fabio Ciravegna
2012 Natural Language Engineering  
DEDICATION This thesis is dedicated to my brilliant wife, Yaxin Liu, for her infinite love and support throughout the course of this work.  ...  Resolving ambiguity concerns recognising the true referent entity of a name reference, essentially a further named entity 'recognition' step and often a compulsory pro-VI  ...  It is a fundamental building block of the Named Entity Disambiguation approach to be discussed in the next chapter of this thesis.  ... 
doi:10.1017/s1351324912000125 fatcat:b62qbqwrqfaf3gytw22yktc5ae

Member activities and quality of tags in a collection of historical photographs in Flickr

Besiki Stvilia, Corinne Jörgensen
2010 Journal of the American Society for Information Science and Technology  
Nouns, named entity terms, and complex terms constituted approximately 77% of the preprocessed set.  ...  of their photo collections to make the collections more accessible and visible.  ...  Acknowledgements The authors would like to express their gratitude to Nicole Alemanne and Shuheng Wu for helpful conversations.  ... 
doi:10.1002/asi.21432 fatcat:tl77mwaigfg2faxvm4rygltpby

Learning multilingual named entity recognition from Wikipedia

Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R. Curran
2013 Artificial Intelligence  
We automatically create enormous, free and multilingual silver-standard training annotations for named entity recognition (ner) by exploiting the text and structure of Wikipedia.  ...  We first classify each Wikipedia article into named entity (ne) types, training and evaluating on 7200 manually-labelled Wikipedia articles across nine languages.  ...  Acknowledgements We would like to thank members of Schwa Lab and the anonymous reviewers for their helpful feedback on all of the research described here.  ... 
doi:10.1016/j.artint.2012.03.006 fatcat:7agjkau5wfhqbeyit3sddv2ggy

Introduction [chapter]

Krisztian Balog
2018 Advanced Topics in Information Retrieval  
as entities).  ...  The objective of this book is to give a detailed account of the developments of a decade of IR research that have enabled us to search for "things, not strings."  ...  At the same time, the massive volumes of usage data collected from users allows for improved methods, by harnessing the "wisdom of the crowds."  ... 
doi:10.1007/978-3-319-93935-3_1 fatcat:d7pyoekwqjenbba5bagxgmxkee

Geographic Information Retrieval

Ross Purves, Christopher Jones
2011 SIGSPATIAL Special  
The scope includes, but is not limited to, geographic information systems.  ...  The SIGSPATIAL Special is the newsletter of the Association for Computing Machinery (ACM) Special Interest Group on Spatial Information (SIGSPATIAL).  ...  Acknowledgements I would like to thank Adam Rae for providing the figure for this paper, and Mor Naaman for providing critical feedback.  ... 
doi:10.1145/2047296.2047297 fatcat:npmfk7kdhrailfrgyyaszkj6wi

Different German and English Coreference Resolution Models for Multi-domain Content Curation Scenarios [chapter]

Ankit Srivastava, Sabine Weber, Peter Bourgonje, Georg Rehm
2018 Lecture Notes in Computer Science  
Coreference Resolution is the process of identifying all words and phrases in a text that refer to the same entity.  ...  It has proven to be a useful intermediary step for a number of natural language processing applications.  ...  We would like to thank the anonymous reviewers for their insightful and helpful comments.  ... 
doi:10.1007/978-3-319-73706-5_5 fatcat:ycsjxih7infbzpu6qvc626lgwq

Indigenous frameworks for data-intensive humanities: recalibrating the past through knowledge engineering and generative modelling

Sydney Shep, Marcus Frean, Rhys Owen, Rere-No-A-Rangi Pope, Pikihuia Reihana, Valerie Chan
2021 Journal of Data Mining and Digital Humanities  
Without accurate data or tools to harmonise existing fragmented or conflicting data sources, issues around land succession, opportunities for economic development, and maintenance of whānau relationships  ...  This paper provides an overview of VUW's culturally-embedded social network approach to the project, discusses the challenges of working within an indigenous worldview, and emphasises the importance of  ...  The authors acknowledge support and funding from the Science for Technological Innovation [2020] National Science Challenge Spearhead Project "Analytics to identify and connect successors to whenua"; the  ... 
doi:10.46298/jdmdh.6095 fatcat:vrqheesznjda5mbo235drbxmwm

Geoparsing history: Locating commodities in ten million pages of nineteenth-century sources

Jim Clifford, Beatrice Alex, Colin M. Coates, Ewan Klein, Andrew Watson
2016 Historical Methods  
We selected these four collections for a number of reasons. We looked for very large collections that would provide the quantity of data needed to test the effectiveness of our text mining methods.  ...  Our text mining system tries to find grounding identifiers for all named entity mentions that it has tagged.  ... 
doi:10.1080/01615440.2015.1116419 fatcat:tep5occm3zciborvtxluvpxtx4

Semantics-Empowered Big Data Processing with Applications

Krishnaprasad Thirunarayan, Amit Sheth
2015 The AI Magazine  
To handle volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making.  ...  To handle velocity, we seek to use continuous semantics capability to dynamically create event or situation specific models and recognize relevant new concepts, entities and facts.  ...  Acknowledgements We acknowledge Cory Henson for significant contributions on Semantic Perception, Pramod Anantharam on hybridization of statistical and logic-based techniques, and in dealing with real-world  ... 
doi:10.1609/aimag.v36i1.2566 fatcat:utph5wmxvzb6ldoz53pusypzjy

Self-monitoring in social networks

Amin Anjomshoaa, Khue Vo Sao, Amirreza Tahamtan, A. Min Tjoa, Edgar Weippl
2012 International Journal of Intelligent Information and Database Systems  
These components can be used by the same method can be also used by individuals to make a self-test of their Web 2.0 contributions and find out what inferences will be derived from their web presence.  ...  This paper proposes a Web 2.0 analysis methodology that provides reusable foundation components for information extraction, analysis and visualization.  ...  to harness collective intelligence (O'Reilly, 2005).  ... 
doi:10.1504/ijiids.2012.049110 fatcat:rlakxir2yrgzbeyp6dwkaxtxna