Entity Resolution in a Big Data Framework

Mayank Kejriwal
The thesis is that building a viable Entity Resolution solution for serving Big Data needs requires simultaneously resolving challenges of automation, heterogeneity, scalability and domain independence  ...  Entity Resolution (ER) concerns identifying logically equivalent pairs of entities that may be syntactically disparate.  ...  Enabling a workflow for arbitrary N , which we designate the N-Way problem, is currently an open area of research and involves novel challenges.  ... 
doi:10.1609/aaai.v29i1.9256 fatcat:lpntzupf5rhq3btgvfg46u5v7a

Web-Scale Blocking, Iterative and Progressive Entity Resolution

Kostas Stefanidis, Vassilis Christophides, Vasilis Efthymiou
2017 2017 IEEE 33rd International Conference on Data Engineering (ICDE)  
We focus on Web-scale blocking, iterative and progressive solutions for entity resolution.  ...  We are interested in frameworks addressing the new challenges in entity resolution posed by the Web of data in which real world entities are described by interlinked data rather than documents.  ...  Compared to data warehouses, the new ER challenges stem from the openness of the Web of data in describing entities by an unbounded number of KBs, the semantic and structural diversity of the descriptions  ... 
doi:10.1109/icde.2017.214 dblp:conf/icde/StefanidisCE17 fatcat:bgsihoioavcflhhh3vlxayltre

End-to-End Entity Resolution for Big Data: A Survey [article]

Vassilis Christophides, Vasilis Efthymiou, Themis Palpanas, George Papadakis, Kostas Stefanidis
2020 arXiv   pre-print
ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem.  ...  One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER).  ...  Open-source ER tools We now elaborate on the main systems that are crafted for end-to-end Entity Resolution.  ... 
arXiv:1905.06397v3 fatcat:rs2qoolz2jcppklriew5pjfefq

Cross-Document Coreference Resolution Using Latent Features

Axel-Cyrille Ngonga Ngomo, Michael Röder, Ricardo Usbeck
2014 International Semantic Web Conference  
This task is known as cross-document co-reference resolution and has been addressed by manifold approaches in the past.  ...  Over the last years, entity detection approaches which combine named entity recognition and entity linking have been used to detect mentions of RDF resources from a given reference knowledge base in unstructured  ...  Acknowledgments This work has been supported by the ESF and the Free State of Saxony and the FP7 project GeoKnow (GA No. 318159).  ... 
dblp:conf/semweb/NgomoRU14 fatcat:inwv7amjrfervklpfzmxibbdsi

Big Data and Cross-Document Coreference Resolution: Current State and Future Opportunities [article]

Seyed-Mehdi-Reza Beheshti, Srikumar Venugopal, Seung Hwan Ryu, Boualem Benatallah, Wei Wang
2013 arXiv   pre-print
The aim of this paper is to provide readers with an understanding of the central concepts, subtasks, and the current state-of-the-art in the CDCR process.  ...  Recently, document datasets of the order of peta-/tera-bytes have raised many challenges for performing effective CDCR such as scaling to large numbers of mentions and limited representational power.  ...  Conclusions and Future Work In this paper we discussed the central concepts, subtasks, and the current state-of-the-art in the Cross-Document Coreference Resolution (CDCR) process.  ... 
arXiv:1311.3987v1 fatcat:qhpa2u4kavd5jjsjvepwrqweue

D2.1 Pid Resolution Services Best Practices

Sara Wimalaratne, Martin Fenner
2018 Zenodo  
This report describes approaches to PID resolution, and sets out best practices to be followed as well as future work in the area and a survey of the practices of different disciplines.  ...  the European Open Science Cloud, and beyond.  ...  It was written by organizations that provide a large part of the PID resolver infrastructure for the European Open Science Cloud and beyond.  ... 
doi:10.5281/zenodo.1324299 fatcat:t2juizluzbe6bcg47pzgawc3de

International Perspectives on Online Dispute Resolution in the E-Commerce Landscape

Teresa Ballesteros
2021 International Journal of Online Dispute Resolution  
This is followed by an analysis of several jurisdictions, namely the United States, China, European Union and Australia.  ...  This article will examine Online Dispute Resolution (ODR) from several perspectives to provide a comprehensive understanding of the global efforts to incorporate ODR in the e-commerce scope.  ...  and neutral entity.  ... 
doi:10.5553/ijodr/235250022021008002002 fatcat:wp7fwtrzuzchbjft2nyyuuo4ze

UMLS to DBPedia link discovery through circular resolution

John Cuzzola, Ebrahim Bagheri, Jelena Jovanovic
2018 JAMIA Journal of the American Medical Informatics Association  
Materials and Methods: We propose a method called circular resolution that utilizes a combination of semantic annotators to map UMLS concepts to DBpedia resources.  ...  Conclusion: The proposed circular resolution method is a simple yet effective technique for linking UMLS concepts to DBpedia resources.  ...  Contributorship Statement The authors declare that this manuscript is a product of original work and each author contributed to the design and interpretation of the results.  ... 
doi:10.1093/jamia/ocy021 pmid:29648604 fatcat:6bbmeyap2fgrbh346ccei6c62u

Analyst's Workspace: An embodied sensemaking environment for large, high-resolution displays

Christopher Andrews, Chris North
2012 2012 IEEE Conference on Visual Analytics Science and Technology (VAST)  
large, high-resolution displays.  ...  By combining spatial layout of documents and other artifacts with an entity-centric, explorative investigative approach, AW aims to allow the analyst to externalize elements of the sensemaking process  ...  Color is used to indicate the current state of the document: open in the workspace (aqua background), selected (blue background), read (light gray text), and unread (black text).  ... 
doi:10.1109/vast.2012.6400559 dblp:conf/ieeevast/AndrewsN12 fatcat:kqsaxufbbnasvdpqgub5mgy74a

Open High-Resolution Satellite Imagery: The WorldStrat Dataset – With Application to Super-Resolution [article]

Julien Cornebise, Ivan Oršolić, Freddie Kalaitzis
2022 arXiv   pre-print
Analyzing the planet at scale with satellite imagery and machine learning is a dream that has been constantly hindered by the cost of difficult-to-access highly-representative high-resolution imagery.  ...  We accompany this dataset with an open-source Python package to: rebuild or extend the WorldStrat dataset, train and infer baseline algorithms, and learn with abundant tutorials, all compatible with the  ...  To Grega Milcinski (Sinergise) and SentinelHub for taking us on board QueryPlanet. To Jamon Van Den Hoek (Oregon State University) for his expertise on GHSL and providing the UNHCR POIs dataset.  ... 
arXiv:2207.06418v1 fatcat:tzrrmwabg5ectlkzgp5vuzienu

A Survey of Blocking and Filtering Techniques for Entity Resolution [article]

George Papadakis, Dimitrios Skoutas, Emmanouil Thanos, Themis Palpanas
2020 arXiv   pre-print
Efficiency techniques are an integral part of Entity Resolution, since its infancy.  ...  In this survey, we organized the bulk of works in the field into Blocking, Filtering and hybrid techniques, facilitating their understanding and use.  ...  This work was partially funded by EU H2020 projects ExtremeEarth (825258) and SmartDataLake (825041).  ... 
arXiv:1905.06167v4 fatcat:zoodv75tazg23cfnq4dwfgt6ge

d-blink: Distributed End-to-End Bayesian Entity Resolution [article]

Neil G. Marchant, Andee Kaplan, Daniel N. Elazar, Benjamin I. P. Rubinstein, Rebecca C. Steorts
2020 arXiv   pre-print
Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.  ...  Our approach relies on several key ideas, including: (i) an auxiliary variable representation that induces a partition of the entities and records into blocks; (ii) a method for constructing well-balanced  ...  ACKNOWLEDGEMENTS The authors would also like to thank the anonymous reviewers, Associate Editor and Editor for their valuable comments and helpful suggestions.  ... 
arXiv:1909.06039v3 fatcat:56ssspoazbgq5cssm7f4uzqucu

An Ensemble Blocking Scheme for Entity Resolution of Large and Sparse Datasets [article]

Janani Balaji, Faizan Javed, Mayank Kejriwal, Chris Min, Sam Sander, Ozgur Ozturk
2016 arXiv   pre-print
Entity Resolution, also called record linkage or deduplication, refers to the process of identifying and merging duplicate versions of the same entity into a unified representation.  ...  The standard practice is to use a Rule based or Machine Learning based model that compares entity pairs and assigns a score to represent the pairs' Match/Non-Match status.  ...  Entity Resolution (ER), also known as Deduplication or Record Linkage, is a long-established challenge in the Artificial Intelligence (AI) domain, where the goal is to provide accurate, fast and scalable  ... 
arXiv:1609.06265v2 fatcat:spc25iaf3bgejb4iispokirstm

Conflict Detection and Resolution in IoT Systems: A Survey

Pavana Pradeep, Krishna Kant
2022 IoT  
We detail the inherent complexities of the problem, survey the work already performed, and lay out the future challenges.  ...  In both cases, this collective behavior often leads to situations where their operation may conflict, and the conflict resolution becomes complex due to lack of visibility into or understanding of the  ...  Goals and Relationships to Other Surveys The purpose of this paper is to provide a state-of-the-art survey of conflict detection and resolution in large IoT systems and point out the research challenges  ... 
doi:10.3390/iot3010012 fatcat:2xobygknerbxbfal27vmghsbyy

AStERISK: Auction-based Shared Economy ResolutIon System for blocKchain [article]

Alberto Sonnino, Michał Król, Argyrios G. Tasiopoulos, Ioannis Psaras
2019 arXiv   pre-print
However, the majority of the systems being developed do not provide mechanisms to pair workers and clients, or rely on manual and insecure resolution.  ...  Recent developments in blockchains and edge computing allow the deployment of a decentralized shared economy with utility tokens, where altcoins secure and reward useful work.  ...  As a result, the majority of users currently rely on clouds for applications such as hosting services, offloading computation, and data storage.  ... 
arXiv:1901.07824v1 fatcat:sk2y7nxemzfzzloqbiwz2fpoze