A hierarchical naive Bayes mixture model for name disambiguation in author citations

Hui Han, Wei Xu, Hongyuan Zha, C. Lee Giles
2005 Proceedings of the 2005 ACM symposium on Applied computing - SAC '05  
This paper presents a hierarchical naive Bayes mixture model, an unsupervised learning approach, for name disambiguation in author citations.  ...  This method partitions a collection of citations into clusters, with each cluster containing only citations authored by the same author, thus disambiguating authorship in citations to induce author name  ...  Acknowledgments We wish to thank Mark Stefik and Cheng Li for their valuable comments on our name disambiguation work.  ...
doi:10.1145/1066677.1066920 dblp:conf/sac/HanXZG05 fatcat:efo7o3b5gjezjn2e577mkufgwy

Ambiguity Resolution for Author Names of Bibliographic Data

Kuang-Ha Chen, Chi-Nan Hsieh
2011 Journal of Educational Media & Library Sciences  
Therefore researches on ambiguity resolution for author name are indispensable.  ...  Users or researchers have been confronted with serious problems in ambiguities of author names, while a great deal of scholar information quickly accumulated in Internet.  ...  (2005a) Hierarchical Naïve Bayes mixture model 1) Publication in author homepages (2 names) 2) Citation in DBLP database (14 names) 1) 65.5% 2) 63.2% Han et al.  ...
doaj:474499ff95914afda944ba4000bcabf1 fatcat:4hpwgvlv4vde7ma2sqgkxy5oa4

Author Name Disambiguation in Bibliographic Databases: A Survey [article]

Muhammad Shoaib, Ali Daud, Tehmina Amjad
2020 arXiv   pre-print
Author Name Disambiguation (AND) in Bibliographic Databases (BD) like DBLP, Citeseer, and Scopus is a specialized field of entity resolution.  ...  Given many citations of underlying authors, the AND task is to find which citations belong to the same author.  ...  Acknowledgement We are grateful to the Higher Education Commission (HEC) of Pakistan for their financial assistance to promote the research trend in the country under Indigenous 5000 Fellowship Program  ...
arXiv:2004.06391v1 fatcat:g6ohfpzeejbwhlxmt7vlmyjqo4

A brief survey of automatic methods for author name disambiguation

Anderson A. Ferreira, Marcos André Gonçalves, Alberto H.F. Laender
2012 SIGMOD record  
Name ambiguity in the context of bibliographic citation records is a hard problem that affects the quality of services and content in digital libraries and similar systems.  ...  In [21], Han et al. present an unsupervised hierarchical version of the naïve Bayes-based method for modeling each author.  ...  The first method uses naïve Bayes (NB), a generative statistical model frequently used in word sense disambiguation, to capture all writing patterns in the authors' citations.  ...
doi:10.1145/2350036.2350040 fatcat:aoze6hty3rdlbaggngpmmp5wee

Efficient topic-based unsupervised name disambiguation

Yang Song, Jian Huang, Isaac G. Councill, Jia Li, C. Lee Giles
2007 Proceedings of the 2007 conference on Digital libraries - JCDL '07  
After learning an initial model, the topic distributions are treated as feature sets and names are disambiguated by leveraging a hierarchical agglomerative clustering method.  ...  Our models explicitly introduce a new variable for persons and learn the distribution of topics with regard to persons and words.  ...  In [8], different classification methods such as hybrid Naive Bayes and Support Vector Machines (SVM) have been applied to a DBLP dataset.  ...
doi:10.1145/1255175.1255243 dblp:conf/jcdl/SongHCLG07 fatcat:k26gwgsok5cqnas7uapu2obzhy

Author disambiguation by hierarchical agglomerative clustering with adaptive stopping criterion

Lei Cen, Eduard C. Dragut, Luo Si, Mourad Ouzzani
2013 Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13  
In particular, pairwise similarity is first learned for publications that share the same author name string (ANS) and then a novel Hierarchical Agglomerative Clustering approach with Adaptive Stopping  ...  This paper proposes new research for entity disambiguation with the focus of name disambiguation in digital libraries.  ...  It is also partially supported by the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.  ... 
doi:10.1145/2484028.2484157 dblp:conf/sigir/CenDSO13 fatcat:iapphuzx4ra5tavphl3zce4kxe

Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

Antonio J Jimeno-Yepes, Bridget T McInnes, Alan R Aronson
2011 BMC Bioinformatics  
The term found in the MEDLINE citation is automatically assigned the UMLS CUI linked to the MeSH heading. Each instance has been assigned a UMLS Concept Unique Identifier (CUI).  ...  Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused on specific types of entities (e.g. diseases  ...  Acknowledgements This work was supported in part by the Intramural Research Program of the NIH, National Library of Medicine and by an appointment of A.  ... 
doi:10.1186/1471-2105-12-223 pmid:21635749 pmcid:PMC3123611 fatcat:icdqhrecqrejbp3dxsxshw2ibu

Bayesian Non-Exhaustive Classification A Case Study: Online Name Disambiguation using Temporal Record Streams [article]

Baichuan Zhang, Murat Dundar, Mohammad Al Hasan
2016 arXiv   pre-print
In this work, we propose a Bayesian non-exhaustive classification framework for solving online name disambiguation task.  ...  As a case study we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups.  ...  [13] propose supervised name disambiguation methodologies by utilizing Naive Bayes and SVM for name entity disambiguation task.  ... 
arXiv:1607.05746v3 fatcat:mrj5sql5w5d5thsciku35br2sq

Name Authority Challenges for Indexing and Abstracting Databases

Denise Beaubien Bennett, Priscilla Williams
2006 Evidence Based Library and Information Practice  
Methods - The article includes an analysis of current name authority practices in I&A databases and of selected research into name disambiguation models applied to authorship of articles.  ...  Striving for name disambiguation rather than name authority control may become an attractive option for catalogues, I&A databases, and digital library collections.  ...  (Han et al. "A Hierarchical Naïve Bayes Mixture Model" 1065).  ...
doi:10.18438/b81596 fatcat:os5zeaevkfdx7hzgk6guvpiwsa

Name Authority Challenges for Indexing and Abstracting Databases [chapter]

Denise Beaubien, Priscilla Head
2011 Cataloging and Indexing  
Methods - The article includes an analysis of current name authority practices in I&A databases and of selected research into name disambiguation models applied to authorship of articles.  ...  Striving for name disambiguation rather than name authority control may become an attractive option for catalogues, I&A databases, and digital library collections.  ...  (Han et al. "A Hierarchical Naïve Bayes Mixture Model" 1065).  ...
doi:10.1201/b13123-3 fatcat:fxbte5z46reaxmquvaoxhaguzu

Bayesian Non-Exhaustive Classification for Active Online Name Disambiguation [article]

Baichuan Zhang, Murat Dundar, Mohammad Al Hasan
2017 arXiv   pre-print
In particular, we present a Dirichlet Process Gaussian Mixture Model (DPGMM) as a core engine for online name disambiguation task.  ...  Toward achieving this objective, in this paper we propose a Bayesian non-exhaustive classification framework for solving online name disambiguation.  ...  [12] propose supervised name disambiguation methodologies by utilizing Naive Bayes and SVM.  ...
arXiv:1708.04531v1 fatcat:mp6ijc3zcrec7ekt3wkfns6exq

Bayesian Non-Exhaustive Classification A Case Study

Baichuan Zhang, Murat Dundar, Mohammad Al Hasan
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
In this work, we propose a Bayesian non-exhaustive classification framework for solving online name disambiguation task.  ...  As a case study we consider bibliographic data in a temporal stream format and disambiguate authors by partitioning their papers into homogeneous groups.  ...  [11] propose supervised name disambiguation methodologies by utilizing Naive Bayes and SVM for name entity disambiguation task.  ... 
doi:10.1145/2983323.2983714 dblp:conf/cikm/ZhangDH16 fatcat:flelyqw2ejh4bndeagaj2kahhe

A tool for generating synthetic authorship records for evaluating author name disambiguation methods

Anderson A. Ferreira, Marcos André Gonçalves, Jussara M. Almeida, Alberto H.F. Laender, Adriano Veloso
2012 Information Sciences  
In order to facilitate the evaluation of name disambiguation methods in various realistic scenarios and under controlled conditions, in this article we propose SyGAR, a new Synthetic Generator of Authorship  ...  The author name disambiguation task has to deal with uncertainties related to the possible many-to-many correspondences between ambiguous names and unique authors.  ...  Acknowledgments This research is partially funded by InWeb - The National Institute of Science and Technology for the Web (MCT/CNPq/FAPEMIG Grant No. 573871/2008-6), and by the authors' individual research  ...
doi:10.1016/j.ins.2012.04.022 fatcat:7anafodyyvh4labbgxboxtmbkm

Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression [article]

David Mimno, Andrew McCallum
2012 arXiv   pre-print
In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such  ...  We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are the authors' and do not necessarily reflect those of the sponsor.  ... 
arXiv:1206.3278v1 fatcat:cqmesheyvzaszf5xh46sxf7zmy

Automating the Construction of Internet Portals with Machine Learning

Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, Kristie Seymore
2012 Information retrieval (Boston)  
Using these techniques, we have built a demonstration system: a portal for computer science research papers.  ...  Domain-specific internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search.  ...  Acknowledgements Most of the work in this paper was performed while all the authors were at Just Research. Kamal Nigam was supported in part by the DARPA HPKB program under contract F30602-97-1-0215.  ... 
doi:10.1023/a:1009953814988 fatcat:qgi4kdx75rcbrhjgjtz4sktr6e