191 Hits in 6.0 sec

Scalable and Distributed Methods for Entity Matching, Consolidation and Disambiguation Over Linked Data Corpora

Aidan Hogan, Antoine Zimmermann, Jürgen Umbrich, Axel Polleres, Stefan Decker
2012 Social Science Research Network  
With respect to large-scale, static, Linked Data corpora, in this paper we discuss scalable and distributed methods for entity consolidation (aka. smushing, entity resolution, object consolidation, etc  ...  Our methods are based upon distributed sorts and scans of the corpus, where we deliberately avoid the requirement for indexing all data.  ...  [57] present the LinksB2N system, which aims to perform scalable integration of RDF data, particularly focusing on evaluation over corpora from the marketing domain; however, their methods are not specific  ... 
doi:10.2139/ssrn.3198933 fatcat:rkewnbi6rvbyzj4zbczjkb2y5u

Scalable and distributed methods for entity matching, consolidation and disambiguation over linked data corpora

Aidan Hogan, Antoine Zimmermann, Jürgen Umbrich, Axel Polleres, Stefan Decker
2012 Journal of Web Semantics  
With respect to large-scale, static, Linked Data corpora, in this paper we discuss scalable and distributed methods for entity consolidation (aka. smushing, entity resolution, object consolidation, etc  ...  Our methods are based upon distributed sorts and scans of the corpus, where we deliberately avoid the requirement for indexing all data.  ...  [57] present the LinksB2N system, which aims to perform scalable integration of RDF data, particularly focusing on evaluation over corpora from the marketing domain; however, their methods are not specific  ... 
doi:10.1016/j.websem.2011.11.002 fatcat:gekxu2bqarbt5h2mkbtpedbj5m

Consolidating Heterogeneous Enterprise Data for Named Entity Linking and Web Intelligence

Albert Weichselbraun, Daniel Streiff, Arno Scharl
2015 International Journal on Artificial Intelligence Tools  
We identify the major challenges of tapping into such sources for named entity linking, and describe required data pre-processing techniques to use and integrate such data sets, with a special focus on  ...  disambiguation and ranking algorithms.  ...  Acknowledgment The research presented in this paper has been conducted as part of the COMET Project (www.htwchur.ch/comet), funded by the Swiss Commission for Technology and Innovation (CTI), and the DecarboNet  ... 
doi:10.1142/s0218213015400084 fatcat:spepviuaercwlczu7lscf2k4mu

Recent advances in methods of lexical semantic relatedness – a survey

Ziqi Zhang, Anna Lisa Gentile, Fabio Ciravegna
2012 Natural Language Engineering  
DEDICATION This thesis is dedicated to my brilliant wife, Yaxin Liu, for her infinite love and support throughout the course of this work.  ...  Resolving ambiguity concerns recognising the true referent entity of a name reference, essentially a further named entity 'recognition' step and often a compulsory pro-  ...  Unstructured Corpora Unstructured corpora can be considered the background information resource for distributional similarity methods.  ... 
doi:10.1017/s1351324912000125 fatcat:b62qbqwrqfaf3gytw22yktc5ae

Author Name Disambiguation in Bibliographic Databases: A Survey [article]

Muhammad Shoaib, Ali Daud, Tehmina Amjad
2020 arXiv   pre-print
Author Name Disambiguation (AND) in Bibliographic Databases (BD) like DBLP, Citeseer, and Scopus is a specialized field of entity resolution.  ...  Categorization and elaboration of similarity metrics and methods are also provided. Finally, future directions and recommendations are given for this dynamic area of research.  ...  Acknowledgement We are grateful to the Higher Education Commission (HEC) of Pakistan for their financial assistance to promote the research trend in the country under Indigenous 5000 Fellowship Program  ... 
arXiv:2004.06391v1 fatcat:g6ohfpzeejbwhlxmt7vlmyjqo4

Entity-fishing: A DARIAH Entity Recognition and Disambiguation Service

Luca Foppiano, Laurent Romary
2020 Journal of the Japanese Association for Digital Humanities  
Input and output data are carried out over a query data model with a defined structure providing flexibility to support the processing of partially annotated text or the repartition of text over several  ...  This paper presents an attempt to provide a generic named-entity recognition and disambiguation module (NERD) called entity-fishing as a stable online service that demonstrates the possible delivery of  ...  The entity disambiguation task (also called entity linking, named entity disambiguation, named entity recognition and disambiguation) consists of determining the actual identity of the entity which is  ... 
doi:10.17928/jjadh.5.1_22 fatcat:noxggjlhljbqbm3ibwm2p3tjiu

Interactive Ambiguity Resolution of Named Entities in Fictional Literature

Florian Stoffel, Wolfgang Jentner, Michael Behrisch, Johannes Fuchs, Daniel Keim
2017 Computer graphics forum (Print)  
In this paper, we present an interactive NER ambiguity resolution technique, which enables users to create (post-processing) rules for named entity recognition data based on the content and entity context  ...  Abstract Named entity recognition (NER) denotes the task to detect entities and their corresponding classes, such as person or location, in unstructured text data.  ...  Acknowledgments This work was supported by the EU project Visual Analytics for Sense-making in Criminal Intelligence Analysis (VALCRI) under grant number FP7-SEC-2013-608142.  ... 
doi:10.1111/cgf.13179 fatcat:rta7ysin4vhn5mpcu5krhifhme

Graph integration of structured, semistructured and unstructured data for data journalism [article]

Oana Balalau
2020 arXiv   pre-print
corpora, even if they lack the ability to define and deploy custom extract-transform-load workflows.  ...  scale, and the solutions we proposed for these problems.  ...  We thank Julien Leblay for his contribution to earlier versions of this work [14, 15]. We thank Xin Zhang for extracting from YAGO 4 the subset used here.  ... 
arXiv:2007.12488v2 fatcat:tus7gf3wdngixnigc6qtjaminq

Searching and Browsing Linked Data with SWSE: The Semantic Web Search Engine

Aidan Hogan, Andreas Harth, Jürgen Umbrich, Sheila Kinsella, Axel Polleres, Stefan Decker
2011 Social Science Research Network  
, SWSE operates over RDF Web data, loosely also known as Linked Data, which implies unique challenges for the system design, architecture, algorithms, implementation and user interface.  ...  Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines  ...  Acknowledgements We would like to thank the anonymous reviewers and the editors for their feedback which helped to improve this paper.  ... 
doi:10.2139/ssrn.3199532 fatcat:ob2ko5yfbzcqpg3fgbrysqstzi

Searching and browsing Linked Data with SWSE: The Semantic Web Search Engine

Aidan Hogan, Andreas Harth, Jürgen Umbrich, Sheila Kinsella, Axel Polleres, Stefan Decker
2011 Journal of Web Semantics  
, SWSE operates over RDF Web data, loosely also known as Linked Data, which implies unique challenges for the system design, architecture, algorithms, implementation and user interface.  ...  Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines  ...  Acknowledgements We would like to thank the anonymous reviewers and the editors for their feedback which helped to improve this paper.  ... 
doi:10.1016/j.websem.2011.06.004 fatcat:lteloasxhvgbhp3256ehrv5wf4

Making sense of social media streams through semantics: A survey

Kalina Bontcheva, Dominic Rout
2014 Semantic Web Journal  
In conclusion, key outstanding challenges are discussed and new directions for research are proposed.  ...  Using semantic technologies for mining and intelligent information access to social media is a challenging, emerging research area.  ...  of Twitter and Facebook messages and for proof-reading the paper.  ... 
doi:10.3233/sw-130110 fatcat:uytdbegs3ngcbpu62i4trrxjni

Web Table Extraction, Retrieval and Augmentation: A Survey [article]

Shuo Zhang, Krisztian Balog
2020 arXiv   pre-print
Tables are a powerful and popular tool for organizing and manipulating data. A vast number of tables can be found on the Web, which represents a valuable knowledge resource.  ...  For each of these tasks, we identify and describe seminal approaches, present relevant resources, and point out interdependencies among the different tasks.  ...  The method utilizes entity linking "impact factors," which are two probabilities, for ranking candidates and for disambiguating entities, based on mention nodes and edges.  ... 
arXiv:2002.00207v2 fatcat:wss5iylwdbh5ziso4fjr4n6zfe

An Experimental Study of State-of-the-Art Entity Alignment Approaches

Xiang Zhao, Weixin Zeng, Jiuyang Tang, Wei Wang, Fabian Suchanek
2020 IEEE Transactions on Knowledge and Data Engineering  
We first propose a general EA framework that encompasses all the current methods, and then group existing methods into three major categories.  ...  Entity alignment (EA) finds equivalent entities that are located in different knowledge graphs (KGs), which is an essential step to enhance the quality of KGs, and hence of significance to downstream applications  ...  already disambiguated entity mentions, and background knowledge such as Wikipedia, to disambiguate linking targets.  ... 
doi:10.1109/tkde.2020.3018741 fatcat:c3fs64qzijcqrmormwwr6r7t2i

WebSets

Bhavana Bharat Dalvi, William W. Cohen, Jamie Callan
2012 Proceedings of the fifth ACM international conference on Web search and data mining - WSDM '12  
In contrast, our method relies on a novel approach for clustering terms found in HTML tables, and then assigning concept names to these clusters using Hearst patterns.  ...  We describe an open-domain information extraction method for extracting concept-instance pairs from an HTML corpus.  ...  Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.  ... 
doi:10.1145/2124295.2124327 dblp:conf/wsdm/DalviCC12 fatcat:ozcdmq4t75ax5afmdkg35qqtbm

WebSets: Extracting Sets of Entities from the Web Using Unsupervised Information Extraction [article]

Bhavana Dalvi, William W. Cohen, Jamie Callan
2013 arXiv   pre-print
In contrast, our method relies on a novel approach for clustering terms found in HTML tables, and then assigning concept names to these clusters using Hearst patterns.  ...  We describe an open-domain information extraction method for extracting concept-instance pairs from an HTML corpus.  ...  Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon.  ... 
arXiv:1307.0261v1 fatcat:ufqyye2nhjh5vot5afzuglaklm
Showing results 1 to 15 out of 191 results