Managing the Quality of Person Names in DBLP [chapter]

Patrick Reuther, Bernd Walter, Michael Ley, Alexander Weber, Stefan Klink
2006 Lecture Notes in Computer Science  
The following paper gives a short overview on DBLP in which the data acquisition and maintenance process underlying DBLP is discussed from a quality point of view.  ...  The paper finishes with a new approach to identify erroneous person names.  ...  For services offering access to scientific publications data quality management, a part of quality management in general, is the central challenge.  ... 
doi:10.1007/11863878_55 fatcat:tvk77wqei5cnxemmzq4e2rvsne

Matching person names through name transformation

Jun Gong, Lidan Wang, Douglas W. Oard
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
Common name variations in the English speaking world are formalized, and the concept of name transformation paths is introduced; name similarity is measured after the best transformation path has been  ...  In this paper, a novel person name matching model is presented.  ...  This work was completed during a visit by the first author to the UMIACS CLIP lab at the University of Maryland that was supported by National Natural Science Foundation of China Grant 70671007.  ... 
doi:10.1145/1645953.1646253 dblp:conf/cikm/GongWO09 fatcat:trs7fo4a2ndnrjk46olo7gqwo4

Publishing bibliographic data on the Semantic Web using BibBase

Reynold S. Xin, Oktie Hassanzadeh, Christian Fritz, Shirin Sohrabi, Renée J. Miller
2013 Semantic Web Journal  
In this demo, we present a brief overview of the features of our system and outline a few challenges in the design and implementation of such a system.  ...  We present BibBase, a system for publishing and managing bibliographic data available in BibTeX files on the Semantic Web.  ...  In this example, the assumption is that the combination of the first letter of first name, middle name, and last name, "JBSmith", is a unique identifier for a person in a single file.  ... 
doi:10.3233/sw-2012-0062 fatcat:xg2t7vb4y5dspg3kpkye4afbha

Integration of Japanese Papers Into the DBLP Data Set [article]

Paul Christian Sommerhoff
2017 arXiv   pre-print
If someone is looking for a certain publication in the field of computer science, the searching person is likely to use the DBLP to find the desired publication.  ...  Especially important are the problems along the way of processing, such as transcription handling and Personal Name Matching with Japanese names.  ...  Japanese scientists might look for the original (Japanese) title "土木関連用語辞典の見出し語の分析と検索システムにおける活用に関する考察" or use Aizawa's name in  ... 
arXiv:1709.09119v1 fatcat:2rv36pef5ffa7ai72ujinkmloe

Developing a Temporal Bibliographic Data Set for Entity Resolution [article]

Yichen Hu, Qing Wang, Peter Christen
2018 arXiv   pre-print
We selected around 80K (1%) of author profiles that cover 2 million (50%) publications using information in DBLP such as alternative author names and personal web profile to improve the reliability of  ...  We completed missing links between publications and author profiles in the DBLP data set using the DBLP public API.  ...  the existence of multiple names of the same profile, as well as persistent and non-persistent personal URLs. • We make the generated temporal data set available on GitHub.  ... 
arXiv:1806.07524v1 fatcat:6ksvdatj3za37o3qzon4c3o36m

DBLP

Michael Ley
2009 Proceedings of the VLDB Endowment  
Many design decisions and details of the public XML-records behind DBLP never were documented.  ...  The DBLP Computer Science Bibliography evolved from an early small experimental Web server to a popular service for the computer science community.  ...  A fourth group of papers deals with person name disambiguation, a special aspect of data quality. DBLP is a (very imperfect) "authority file" [1] for computer science researchers.  ... 
doi:10.14778/1687553.1687577 fatcat:ipvzgaxpbrhrvi73curobjr42i

Semantic analytics on social networks

Boanerges Aleman-Meza, Meenakshi Nagarajan, Cartic Ramakrishnan, Li Ding, Pranam Kolari, Amit P. Sheth, I. Budak Arpinar, Anupam Joshi, Tim Finin
2006 Proceedings of the 15th international conference on World Wide Web - WWW '06  
network of the DBLP bibliography.  ...  We describe our experiences developing this application in the context of a class of Semantic Web applications, which have important research and engineering challenges in common.  ...  and the DBLP co-authorship network.  ... 
doi:10.1145/1135777.1135838 dblp:conf/www/Aleman-MezaNRDKSAJF06 fatcat:qkii3fqt2fhj3d3aaeizfw5wke

Harnessing Historical Corrections to Build Test Collections for Named Entity Disambiguation [chapter]

Florian Reitz
2018 Lecture Notes in Computer Science  
Matching mentions of persons to the actual persons (the name disambiguation problem) is central for several digital library applications.  ...  One collection focuses on the properties of defects and one on the evaluation of disambiguation algorithms.  ...  Acknowledgements The research in this paper is funded by the Leibniz Competition, grant no. LZI-SAW-2015-2.  ... 
doi:10.1007/978-3-030-00066-0_4 fatcat:jcqoka6ajnhpvndkw5exeyltyi

Why name ambiguity resolution matters for scholarly big data research

Jinseok Kim, Jana Diesner, Heejun Kim, Amirhossein Aleyasen, Hwan-Min Kim
2014 2014 IEEE International Conference on Big Data (Big Data)  
The gaps between outcomes of name ambiguity resolution methods range from -4.23% to -87.36% per dataset for the number of unique authors, from 3.75% to 691.20% for average productivity, and from 5.06%  ...  The comparison of resulting bibliometric and network properties revealed that initial-disambiguation bears the prevalent risks of incorrectly merging author identities, underestimating the number of unique  ...  ACKNOWLEDGMENT We thank Brian Karrer (Facebook), Travis Martin (Univ. of Michigan), Brian Ball (Dotomi Inc.), and Mark Newman (Univ. of Michigan) for helping us to disambiguate author names in the APS  ... 
doi:10.1109/bigdata.2014.7004345 dblp:conf/bigdataconf/KimDKAK14 fatcat:peovaf6g2fgwllkukghmziueqa

Finding scientific papers with homepagesearch and MOPS

Gerd Hoff, Martin Mundhenk
2001 Proceedings of the 19th annual international conference on Computer documentation - SIGDOC '01  
The names of these scientists are obtained from the DBLP server [9]. The HomePageSearch system finds the Home Pages according to the names, and Mops finds research papers close to the Home Pages.  ...  We conclude that such a focused crawling is very effective for building high-quality collections and indices of scientific papers, using ordinary desktop hardware.  ...  Acknowledgement We thank Michael Ley for providing us with data from DBLP [9] and for very helpful discussions.  ... 
doi:10.1145/501516.501556 dblp:conf/sigdoc/HoffM01 fatcat:rqoq62du5rf23k2c4juujf53iu

Finding scientific papers with homepagesearch and MOPS

Gerd Hoff, Martin Mundhenk
2001 Proceedings of the 19th annual international conference on Computer documentation - SIGDOC '01  
The names of these scientists are obtained from the DBLP server [9]. The HomePageSearch system finds the Home Pages according to the names, and Mops finds research papers close to the Home Pages.  ...  We conclude that such a focused crawling is very effective for building high-quality collections and indices of scientific papers, using ordinary desktop hardware.  ...  Acknowledgement We thank Michael Ley for providing us with data from DBLP [9] and for very helpful discussions.  ... 
doi:10.1145/501554.501556 fatcat:ngfgre2j7rhrreylmt5y4uqmqi

Integration of Scholarly Communication Metadata Using Knowledge Graphs [chapter]

Afshin Sadeghi, Christoph Lange, Maria-Esther Vidal, Sören Auer
2017 Lecture Notes in Computer Science  
In this work, we created an integrated graph of scientific knowledge from DBLP and the Microsoft Academic Graph and describe the challenges in matching, linking and integrating the datasets and our approach  ...  As proof of concept, we illustrate the different steps in the construction of a knowledge graph in the domain of scholarly communication metadata (SCM-KG).  ...  Acknowledgments: This work has been partially funded by the European Commission under grant agreements 643410 (OpenAIRE2020) and 644564 (BigDataEurope), and the DFG under grant agreement AU 340/9-1 (OSCOSS  ... 
doi:10.1007/978-3-319-67008-9_26 fatcat:bznxtvfh5rdl7bfwk5neaxq6si

Scalable semantic analytics on social networks for addressing the problem of conflict of interest detection

Boanerges Aleman-Meza, Meenakshi Nagarajan, Li Ding, Amit Sheth, I. Budak Arpinar, Anupam Joshi, Tim Finin
2008 ACM Transactions on the Web  
In this article, we demonstrate the applicability of semantic techniques for detection of Conflict of Interest (COI).  ...  We describe in detail the challenges involved in two important aspects on building Semantic Web applications, namely, data acquisition and entity disambiguation (or reference reconciliation).  ...  For example, DBLP has different entries that in the real world refer to the same person, such as the case of "Ed H. Chi" and "Ed Huai-hsin Chi."  ... 
doi:10.1145/1326561.1326568 fatcat:o42pa4gffnbo5osfjrv326l6qa

The Impact of Name Ambiguity on Properties of Coauthorship Networks

Jinseok Kim, Heejun Kim, Jana Diesner
2014 Journal of Information Science Theory and Practice  
ACKNOWLEDGEMENTS This work is supported by KISTI (Korea Institute of Science and Technology Information), grant P14033, and the FORD Foundation, grant 0145-0558.  ...  three authors If 'Jackson, P.' has only ONE match candidate with a middle name initial, Jackson, P. Jackson, P. A.  ...  DBLP is well known for its high quality citation data. This is partially due to the fact that the DBLP team has dedicated database management efforts to name disambiguation (Ley, 2002, 2009).  ... 
doi:10.1633/jistap.2014.2.2.1 fatcat:irbkbkbj35eohnpbprfcv7qp3m

Mining knowledge from databases

Jiawei Han, Yizhou Sun, Xifeng Yan, Philip S. Yu
2010 Proceedings of the 2010 international conference on Management of data - SIGMOD '10  
In this tutorial, we introduce database-oriented information network analysis methods and demonstrate how information networks can be used to improve data quality and consistency, facilitate data integration  ...  , and data quality improvement, how to discover various kinds of knowledge from information networks, how to perform OLAP in information networks, and how to transform database data into knowledge by information  ...  Similarity study on homogenous networks: Sim-Rank and Personalized PageRank iv. Challenges of heterogenous networks 3.  ... 
doi:10.1145/1807167.1807333 dblp:conf/sigmod/HanSYY10 fatcat:y3ozzuynxfb3rdtrbf3jldjkci