368 Hits in 2.4 sec

Improving precision and recall for Soundex retrieval

D. Holmes, M.C. McCabe
2002 Proceedings. International Conference on Information Technology: Coding and Computing
We present a phonetic algorithm that fuses existing techniques and introduces new features. This combination offers improved precision and recall.  ...  We demonstrate fusion for improving the precision and recall of name searches, by combining Russell, Celko and Pfeifer techniques with our own.  ...  Misspellings, nicknames, phonetic and cultural variations complicate name-based information retrieval. The challenge is to improve recall without lowering precision.  ... 
doi:10.1109/itcc.2002.1000354 dblp:conf/itcc/HolmesM02 fatcat:kbyiwdwaxnavdgmo4rd3575dmm

SoundexGR: An algorithm for phonetic matching for the Greek language

Antrei Kavros, Yannis Tzitzikas
2022 Natural Language Engineering  
between precision and recall in datasets with different kinds of errors.  ...  For this reason, in this paper, we introduce an algorithm for phonetic matching designed for the Greek language: we start from the original Soundex and we redesign and extend it for accommodating the Greek  ...  The authors would like to thank Katerina Papantoniou for her feedback and for proof-reading the paper, and the anonymous reviewers for their fruitful comments and suggestions.  ... 
doi:10.1017/s1351324922000018 fatcat:b4bfnhrmfbcq3gkk24prfnehyq

Performance Evaluation of Phonetic Matching Algorithms on English Words and Street Names - Comparison and Correlation

Keerthi Koneru, Venkata Sai Venkatesh Pulla, Cihan Varol
2016 Proceedings of the 5th International Conference on Data Management Technologies and Applications  
Soundex is the first algorithm proposed and other algorithms like Metaphone, Caverphone, DMetaphone, Phonex etc., have been also used for information retrieval in different environments.  ...  Though Soundex has high accuracy in correcting the misspelled words compared to other algorithms, it has lower precision due to more noise in the considered arena.  ...  Evaluation Metrics: The performance of phonetic matching algorithms used for information retrieval is evaluated by calculating precision, recall, and F-Measure.  ... 
doi:10.5220/0005926300570064 dblp:conf/data/KoneruPV16 fatcat:3tkjk644ljbbjpw2anmab6no4y

Inducing Search Keys for Name Filtering

Karl Branting
2007 Conference on Empirical Methods in Natural Language Processing  
ETK has the low computational cost and ability to filter by phonetic similarity characteristic of phonetic keys such as Soundex, but is adaptable to alternative similarity models.  ...  This paper describes ETK (Ensemble of Transformation-based Keys), a new algorithm for inducing search keys for name filtering.  ...  0.0468 ETK 0.3496 0.1687 0.2276. Table 4: Recall, precision, and f-measure for edit distance on U.S. surnames (values listed as recall, precision, f-measure): BKT 1.0000, 0.0024, 0.0048; partition 1.0000, 0.0106, 0.0210; k=  ... 
dblp:conf/emnlp/Branting07 fatcat:56rkc5w6rvhrldjmhagsem5dfu

Phonetic string matching

Justin Zobel, Philip Dart
1996 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '96  
Our experimental comparison with existing techniques such as Soundex and edit distances, which is based on recall and precision, demonstrates that the new techniques are superior.  ...  In this paper we explain the parallels between information retrieval and phonetic matching, and describe our new phonetic matching techniques.  ...  We would also like to thank Ross Wilkinson and Hugh Williams. This work was supported by the Australian Research Council.  ... 
doi:10.1145/243199.243258 dblp:conf/sigir/ZobelD96 fatcat:5jxsqojmvfh5nm6dothh6vymki

Cross-language Phonetic Similarity Measure on Terms Appeared in Asian Languages

Ohnmar Htun, Shigeaki Kodama, Yoshiki Mikami
2011 International Journal of Intelligent Information Processing  
After evaluating the ratios of precision, recall, and F-measure, the results show that the proposed methodology successfully differentiates between phonetic and semantic groups by allocating the thresholds  ...  The results reported here prove that the proposed method has the potential to be applied to cross-language information retrieval and various linguistic studies.  ...  Evaluation: Recall and precision are standard evaluation strategies for information retrieval. They are also used extensively in the information retrieval literature.  ... 
doi:10.4156/ijiip.vol2.issue2.2 fatcat:323ofon6onddtbpbw5emioifry

On the development of name search techniques for Arabic

Syed Uzair Aqeel, Steve Beitzel, Eric Jensen, David Grossman, Ophir Frieder
2006 Journal of the American Society for Information Science and Technology  
Consequently, algorithms such as Soundex and n-gram matching are of limited utility for languages such as Arabic, which has a vastly different morphology that relies heavily on phonetic information.  ...  The need for effective identity matching systems has led to extensive research in the area of name search. For the most part, such work has been limited to English and other Latin-based languages.  ...  Similarly, further improved resilience to improper diacritic use would likely continue to yield improvements for Arabic name search systems.  ... 
doi:10.1002/asi.20323 fatcat:6xidlonm6nb73f7g4notguadfq

Phonetic Models for Generating Spelling Variants

Rahul Bhagat, Eduard H. Hovy
2007 International Joint Conference on Artificial Intelligence  
Our methods show threefold improvement over the baseline and generate four times as many good name variants compared to a human while maintaining a respectable precision of 0.68.  ...  Knowing the different variations can significantly improve the results of name-searches on various source texts, especially when recall is important.  ...  Andrew Philpot for providing the list of baby names with their variants and Dr. Patrick Pantel for his expert advice on evaluation.  ... 
dblp:conf/ijcai/BhagatH07 fatcat:wtvyss2cszbg5hx6qtc46ctuzm

How to Play the 'Names Game': Patent Retrieval Comparing Different Heuristics

Julio Raffo, Stephane Lhuillery
2009 Social Science Research Network  
Patent statistics represent a critical tool for scholars, statisticians and policy makers interested in innovation and intellectual property rights.  ...  Guidelines for researchers, TTOs, firms, venture capitalists and policy makers likely to implement a names game or to comment on results based on a names game are also provided.  ...  Acknowledgements: We gratefully acknowledge the EPFL TTO and the EPFL Human Resources for data availability.  ... 
doi:10.2139/ssrn.1441172 fatcat:zxv3vovakbhpbofxgsffrajuwa

How to play the "Names Game": Patent retrieval comparing different heuristics

Julio Raffo, Stéphane Lhuillery
2009 Research Policy  
Patent statistics represent a critical tool for scholars, statisticians and policy makers interested in innovation and intellectual property rights.  ...  Guidelines for researchers, TTOs, firms, venture capitalists and policy makers likely to implement a names game or to comment on results based on a names game are also provided.  ...  Acknowledgements: We gratefully acknowledge the EPFL TTO and the EPFL Human Resources for data availability.  ... 
doi:10.1016/j.respol.2009.08.001 fatcat:xbmowwojfbdq7mabqtpoirev4q

A comparative evaluation of name-matching algorithms

L. Karl Branting
2003 Proceedings of the 9th international conference on Artificial intelligence and law - ICAIL '03  
This paper proposes a three-stage framework for name matching, identifies how each stage in the framework addresses the naming variations that typically arise in legal cases, describes several alternative  ...  function that is both order-insensitive and tolerant of small numbers of omissions or additions; and compare names in a symmetrical, word-by-word fashion.  ...  If recall and precision are weighted equally, the F-measure is the harmonic mean of recall and precision: $F = \frac{2PR}{P + R}$, where P is precision and R is recall.  ... 
doi:10.1145/1047788.1047837 dblp:conf/icail/Branting03 fatcat:6t7gfj7h4jcmvei5xjvn6caxbu
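
Several of the snippets in this listing evaluate name matching by precision, recall, and the equally weighted F-measure (the harmonic mean of the two). As a minimal sketch of how those three numbers relate for a single query, the function below assumes the retrieved and relevant names are given as plain collections of strings; the function name `precision_recall_f` and the sample names are hypothetical and do not come from any of the listed papers.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F) for one query.

    retrieved: names the matcher returned (hypothetical input)
    relevant:  names that are true matches (hypothetical input)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # correctly retrieved names

    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # Equally weighted F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Illustrative data only: 2 of the 4 retrieved names are true matches,
    # and 1 of the 3 true matches was missed.
    p, r, f = precision_recall_f(
        ["Smith", "Smyth", "Smythe", "Schmidt"],
        ["Smith", "Smyth", "Smit"],
    )
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because F is a harmonic mean, a matcher that retrieves everything (recall 1.0) but with very low precision still gets an F close to zero, which is the pattern visible in the BKT and partition rows quoted in the Branting 2007 snippet above.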

How Does That Sound? Multi-Language SpokenName2Vec Algorithm Using Speech Generation and Deep Learning [article]

Aviad Elyashar, Rami Puzis, Michael Fire
2020 arXiv pre-print
In most cases, users are aided by queries containing a name and sending back to the web search engines for finding their will.  ...  Utilizing the name pronunciation can be helpful for both differentiating and detecting names that sound alike, but are written differently.  ...  The same pattern is seen for F1, precision, and recall.  ... 
arXiv:2005.11838v2 fatcat:xlcpea3qijcpznoe3xzxrgk54y

Finding approximate matches in large lexicons

Justin Zobel, Philip Dart
1995 Software, Practice & Experience  
We propose methods for combining these techniques, and show experimentally that these combinations yield good retrieval effectiveness while keeping index size and retrieval time low.  ...  Approximate string matching is used for spelling correction and personal name matching.  ...  Acknowledgements: This work was supported by the Australian Research Council, the Collaborative Information Technology Research Institute, and the Centre for Intelligent Decision Systems.  ... 
doi:10.1002/spe.4380250307 fatcat:tdxt6v4yzffbvi5iqrrzxevepy

Combining Word and Phonetic-Code Representations for Spoken Document Retrieval [chapter]

Alejandro Reyes-Barragán, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda
2011 Lecture Notes in Computer Science  
Experimental results on the CLEF-CLSR-2007 corpus are encouraging; the proposed hybrid method improved the mean average precision and the number of retrieved relevant documents from the traditional word-based  ...  The traditional approach for spoken document retrieval (SDR) uses an automatic speech recognizer (ASR) in combination with a word-based information retrieval method.  ...  This work was done under partial support of CONACYT (project grant CB-2008-106013-Y, and scholarship 204467). We would also like to thank the CLEF organizing committee for the resources provided.  ... 
doi:10.1007/978-3-642-19437-5_38 fatcat:clcy62p5pfbmrilswcfwmwr5k4

A Case Study in Name Matching

Ronald J. Leach
2006 Names  
The new algorithm is easily automated and holds promise as a search technique for much larger data sets.  ...  The new algorithm had improved performance on the data set, achieving 92% success in name matching when the best two matches were used.  ...  The term "precision" is often used in conjunction with this type of name searching, especially in the information retrieval research community.  ... 
doi:10.1179/nam.2006.54.4.321 fatcat:i2lqbo2hz5bz3adsleirj63hpi