Managing misspelled queries in IR applications

Jesús Vilares, Manuel Vilares, Juan Otero
2011 Information Processing & Management  
A first strategy involves those approaches based on correcting the misspelled query, thus requiring the integration of linguistic information in the system.  ...  The results obtained demonstrate that stemming-based approaches are highly sensitive to misspelled queries, particularly with short queries.  ...  Alonso, of Univ. of A Coruña (Spain), and the reviewers for their helpful comments and suggestions in order to improve this article.  ... 
doi:10.1016/j.ipm.2010.08.004 fatcat:kn3pxgjxpneffjuhys24qdslxe

Corrupted queries in Spanish text retrieval

Juan Otero, Jesús Vilares, Manuel Vilares Ferro
2008 Proceeding of the 2nd ACM workshop on Improving non english web searching - iNEWS '08  
In this paper, we propose and evaluate two different alternatives to deal with degraded queries on Spanish IR applications.  ...  In order to study their validity, a testing framework has been formally designed and applied on both approaches.  ...  In this sense, most authors directly apply error correction techniques on lexical forms in order to provide IR tools with a robust querying facility.  ... 
doi:10.1145/1460027.1460034 dblp:conf/cikm/OteroVF08 fatcat:5nnwrqrcufgofje6csapkiique

Studying the effect and treatment of misspelled queries in Cross-Language Information Retrieval

Jesús Vilares, Miguel A. Alonso, Yerai Doval, Manuel Vilares
2016 Information Processing & Management  
In contrast with their monolingual counterparts, little attention has been paid to the effects that misspelled queries have on the performance of Cross-Language Information Retrieval (CLIR) systems.  ...  The present work makes a first attempt to fill this gap by extending our previous work on monolingual retrieval in order to study the impact that the progressive addition of misspellings to input queries  ...  in the same language) that can be managed by classic IR systems.  ... 
doi:10.1016/j.ipm.2015.12.010 fatcat:d4wd4nvpd5caxlopqrtlti3jhy

Integrating Query of Relational and Textual Data in Clinical Databases: A Case Study

J. M. Fisk, P. Mutalik, F. W. Levin, J. Erdos, C. Taylor, P. Nadkarni
2003 JAMIA Journal of the American Medical Informatics Association  
Abstract Objectives: The authors designed and implemented a clinical data mart composed of an integrated information retrieval (IR) and relational database management system (RDBMS).  ...  Conclusion: A robust IR+RDBMS system can be developed, but it requires integrating RDBMSs with third-party IR software.  ...  Currently, however, the vendors have just barely managed to get IR functionality in place, and even this functionality is sometimes incomplete.  ... 
doi:10.1197/jamia.m1133 pmid:12509355 pmcid:PMC150357 fatcat:tbyuwargonaehenagmmfufphtq

Spelling Correction for Search Engine Queries [chapter]

Bruno Martins, Mário J. Silva
2004 Lecture Notes in Computer Science  
However, recent studies show misspelled words are very common in queries to these systems. When users misspell query, the results are incorrect or provide inconclusive information.  ...  In this work, we discuss the integration of a spelling correction component into tumba!, our community Web search engine.  ...  Terminology Information Retrieval (IR) concerns with the problem of providing relevant documents in response to a user's query [2].  ... 
doi:10.1007/978-3-540-30228-5_33 fatcat:22lzm53aovclnpwyuvpebtrjpe

Information access in the presence of OCR errors

Kazem Taghva, Thomas Nartker, Julie Borsack
2004 Proceedings of the 1st ACM workshop on Hardcopy document processing - HDP '04  
In this paper, we will highlight our findings and detail our current activities.  ...  Over the last 15 years, the Information Science Research Institute (ISRI) at the University of Nevada, Las Vegas (UNLV) has conducted information access research in the presence of OCR errors.  ...  Effects on IR Models A boolean system relies on the presence of query terms within documents to determine relevance.  ... 
doi:10.1145/1031442.1031443 fatcat:3zoiskjkibcq3mcfb7bsv5qrua

MANICURE document processing system

Kazem Taghva, Allen Condit, Julie Borsack, John Kilburg, Changshi Wu, Jeff Gilbreth, Daniel P. Lopresti, Jiangying Zhou
1998 Document Recognition V  
In this paper the functionalities supported by MANICURE and their implementations are described.  ...  In particular, we provide information on specific modules dealing with automatic detection and correction of OCR errors and automatic markup of logical components of the text.  ...  In this capacity, depending on the application, requirements on accuracy and text structure vary.  ... 
doi:10.1117/12.304631 dblp:conf/drr/TaghvaCBKWG98 fatcat:6di4ifyukzfddpcyestp362jxa

Assisting the searcher: utilizing software agents for Web search systems

Bernard J. Jansen, Udo Pooch
2004 Internet Research  
Keywords: software agents, information retrieval, user evaluation of information retrieval (IR) systems and users, including improper query formulation, ineffectiveness in expanding results  ...  Brajnik, Guida, and Tasso (1987) implemented an adaptive IR interface that utilized natural language queries.  ...  Spelling: Searchers routinely misspell terms in queries (Yee, 1991), which usually drastically reduces the number of results retrieved.  ... 
doi:10.1108/10662240410516291 fatcat:ptdhkig4ebcv5kgx6yzrswpe2q

DB&IR integration

Sihem Amer-Yahia, Djoerd Hiemstra, Thomas Roelleke, Divesh Srivastava, Gerhard Weikum
2008 SIGMOD record  
The seminar title was interpreted in an IR-style "andish" sense (it covered also subsets of {Ranking, XML, Querying}, with larger sets being favored) rather than the DB-style strictly conjunctive manner  ...  This paper is based on a five-day workshop on "Ranked XML Querying" that took place in Schloss Dagstuhl in Germany in March 2008 and was attended by 27 people from three different research communities:  ...  However, there are now many applications that require managing both structured and unstructured data and thus mandate serious consideration on how to integrate the DB and IR worlds at both foundational  ... 
doi:10.1145/1462571.1462584 fatcat:ig7lugasfvbxhfcnhbyn2x5tli

Simultaneous multilingual search for translingual information retrieval

Kristen Parton, Kathleen R. McKeown, James Allan, Enrique Henestroza
2008 Proceeding of the 17th ACM conference on Information and knowledge mining - CIKM '08  
We show how close integration of CLIR and SMT allows us to improve result translation in addition to IR results. Figure 4. Query-directed statistical machine translation post-editing.  ...  We present a framework for translingual IR that integrates document translation and query translation into the retrieval model.  ...  Using SMLIR, only a single index has to be managed, and each query results in one IR query, whose results can be returned without further processing. The query time for both methods is comparable.  ... 
doi:10.1145/1458082.1458179 dblp:conf/cikm/PartonMAH08 fatcat:2np6o7n55ndgdg6u6fck5rsz3q

Efficient NN Spatial Keyword Search Using Spatial Inverted (SI) Index

B.A. Vishnupriya, N. Senthamarai, S. Bharathi
2018 International Journal of Engineering & Technology  
away in spatial databases.  ...  On the off chance that the EVR is absent in the two file structure of intermediary it offer question to LBS.  ...  If the query given by the user is misspelled or with typo error, this type of fuzzy keyword is manage by n gram/2L Approximate inverted index.  ... 
doi:10.14419/ijet.v7i2.19.12101 fatcat:fds6coagbnbltfbafkzviiebzi

Using word embeddings to expand terminology of dietary supplements on clinical notes

Yadan Fan, Serguei Pakhomov, Reed McEwan, Wendi Zhao, Elizabeth Lindemann, Rui Zhang
2019 JAMIA Open  
The increasing corpus size results in more misspellings, but not more semantic variants brand names. Word2vec model is also found more capable of detecting semantically similar terms than GloVe.  ...  We propose that this method can be potentially applied to create a DS vocabulary for downstream applications, such as information extraction.  ...  Since this query expansion is not involved in an IR system, no relevance related to the identified notes is evaluated. We described the experiments in the following two tasks.  ... 
doi:10.1093/jamiaopen/ooz007 pmid:31825016 pmcid:PMC6904105 fatcat:ykwubxeq6vbvre6mtumodmatxm

NASA indexing benchmarks: evaluating text search engines

Sandra L. Esler, Michael L. Nelson
1997 Journal of Network and Computer Applications  
The current proliferation of on-line information resources underscores the requirement for the ability to index collections of information and search and retrieve them in a convenient manner.  ...  These efficiently manage memory needed to index large collections.  ...  , discovery of misspelled words, etc.) but has the slowest query times which average roughly .40 seconds than each SWISH query.  ... 
doi:10.1006/jnca.1997.0049 fatcat:72e5dvchjrcdxgnvegnfjnrbou

Managing syntactic variation in text retrieval

Jesús Vilares, Carlos Gómez-Rodríguez, Miguel A. Alonso
2005 Proceedings of the 2005 ACM symposium on Document engineering - DocEng '05  
Two different sources of syntactic information, queries and documents, are studied in order to increase the performance of Information Retrieval systems.  ...  In this paper we deal with European languages, taking Spanish as a case in point.  ...  Shallow parsing has shown itself to be useful in several NLP application fields, particularly in Information Extraction [7], although its application in IR has not yet been studied in depth.  ... 
doi:10.1145/1096601.1096643 dblp:conf/doceng/VilaresGA05 fatcat:z6khtwza25ca3pqnhayivefyau

Effects of OCR errors on ranking and feedback using the vector space model

Kazem Taghva, Julie Borsack, Allen Condit
1996 Information Processing & Management  
In particular, we observed that cosine normalization plays a considerable role in the disparity seen between the collections.  ...  We report on the performance of the vector space model in the presence of OCR errors.  ...  TREC experiments showed similar problems with misspellings in their collections.  ... 
doi:10.1016/0306-4573(95)00058-5 fatcat:tqd2ghcrerf3zml7guqsjhp25y