An alternative approach to natural language query expansion in search engines: Text analysis of non-topical terms in Web documents

Rahmatollah Fattahi, Concepción S. Wilson, Fletcher Cole
2008 Information Processing & Management  
This paper presents a new approach to query expansion in search engines through the use of general non-topical terms (NTTs) and domain-specific semi-topical terms (STTs).  ...  Our findings suggest that web searching could be greatly enhanced combining NTTs (and STTs) with TTs in an initial query.  ...  Acknowledgement This work was funded, in part, by the John Metcalfe Visitor's Grant awarded to the first author, Rahmatollah Fattahi.  ... 
doi:10.1016/j.ipm.2007.09.009 fatcat:w6ivv3h7ubc4tmcdljecknjbwe

Textual resource acquisition and engineering

J. Chu-Carroll, J. Fan, N. Schlaefer, W. Zadrozny
2012 IBM Journal of Research and Development  
Source acquisition is an iterative development process of acquiring new collections to cover salient topics deemed to be gaps in existing resources based on principled error analysis.  ...  However, the topic of source acquisition and engineering has received very little attention so far.  ... 
doi:10.1147/jrd.2012.2185901 fatcat:rk7qmv7umjh3znxojvmf72smgu

Mobile Clustering Engine [chapter]

Claudio Carpineto, Andrea Della Pietra, Stefano Mizzaro, Giovanni Romano
2006 Lecture Notes in Computer Science  
An experimental evaluation, besides confirming that finding information is more difficult on a PDA than on a desktop computer, suggests that mobile clustering engine is more effective than mobile search  ...  Credino is probably the first clustering engine for mobile devices freely available for testing on the Web.  ...  Thanks to its mathematical properties, it supports various tasks of text analysis based on inter-document similarity, including query refinement, browsing retrieval, document ranking, and text mining  ... 
doi:10.1007/11735106_15 fatcat:6b4b66ms5fejni6b6ktyzjigcq

Health search engine with e-document analysis for reliable search results

Arnaud Gaudinat, Patrick Ruch, Michel Joubert, Philippe Uziel, Anne Strauss, Michèle Thonnet, Robert Baud, Stéphane Spahni, Patrick Weber, Juan Bonal, Celia Boyer, Marius Fieschi (+1 others)
2006 International Journal of Medical Informatics  
National Library of Medicine) and advanced approaches such as conclusion extraction from structured document, reformulation of the query, WRAPIN offers to the user a privileged access to navigate through  ...  KEYWORDS Web Search engine; Trustworthy information; eHealth Summary Objective: After a review of the existing practical solution available to the citizen to retrieve eHealth document, the paper describes  ...  The authors wish to thank all the WRAPIN members as well as the WRAPIN tester with their precious contribution in this ambitious project.  ... 
doi:10.1016/j.ijmedinf.2005.11.002 pmid:16377235 fatcat:tukdztrzkzbflcnihiqbbdblgu

Search Engines Going beyond Keyword Search: A Survey

Mahmudur Rahman
2013 International Journal of Computer Applications  
In order to solve the problem of information overkill on the web or large domains, current information retrieval tools especially search engines need to be improved.  ...  This paper tries to identify the major challenges for today's keyword search engines to adapt with the fast growth of web and support comprehensive user demands in quick time.  ...  [10] presents ConceptWorld, an instrument to automatically discover various facets of a topic of interest by extracting concepts from Web documents.  ... 
doi:10.5120/13200-0357 fatcat:6e44oj4fwjfo5ljhxzgv27jzmm

Linggle: a Web-scale Linguistic Search Engine for Words in Context

Joanne Boisson, Ting-hui Kao, Jian-Cheng Wu, Tzu-Hsi Yen, Jason S. Chang
2013 Annual Meeting of the Association for Computational Linguistics  
In this paper, we introduce a Web-scale linguistics search engine, Linggle, that retrieves lexical bundles in response to a given query.  ...  We plan to extend Linggle to provide fast and convenient access to a wealth of linguistic information embodied in Web scale datasets including Google Web 1T and Google Books Ngram for many major languages  ...  Related work Web-scale Linguistic Search Engine (LSE) has been an area of active research. Recently, the state-of-the-art in LSE research has been reviewed in Fletcher (2012).  ... 
dblp:conf/acl/BoissonKWYC13 fatcat:w3vabe3v7rdopia2piccajxbme

A knowledge-based search engine powered by wikipedia

David N. Milne, Ian H. Witten, David M. Nichols
2007 Proceedings of the sixteenth ACM conference on Conference on information and knowledge management - CIKM '07  
Koru exhibits an understanding of the topics of both queries and documents. This allows it to (a) expand queries automatically and (b) help guide the user as they evolve their queries interactively.  ...  It was capable of lending assistance to almost every query issued to it; making their entry more efficient, improving the relevance of the documents they return, and narrowing the gap between expert and  ...  an alternative title of an article with the preferred one.  ... 
doi:10.1145/1321440.1321504 dblp:conf/cikm/MilneWN07 fatcat:w3z4nghktbg5bp3a6fyejxsbe4

Introducing lateral thinking in search engines

Yann Landrin-Schweitzer, Pierre Collet, Evelyne Lutton
2006 Genetic Programming and Evolvable Machines  
This common statement applies to huge databases, where state of the art search engines may retrieve hundreds of very similar documents for a precise query.  ...  User queries are rewritten with the help of the user profile, then the database is searched with the set of rewritten queries and presented to the user as a list of documents, in the same way as any usual  ...  Acknowledgements The authors are grateful to Thierry Prost, MD, PhD, for his educated analysis of the innards of ELISE.  ... 
doi:10.1007/s10710-006-7008-z fatcat:d2wiyfshhbgtnhiqo5buupovqi

Building Better Search Engines by Measuring Search Quality

Ellen M. Voorhees, Paul Over, Ian Soboroff
2014 IT Professional Magazine  
Origins of TREC Today we take search for text documents in our native language for granted, but web search engines such as Yahoo, Google, and Bing were not built in a day, nor is web content the only area  ...  Search engines are developed using standard sets of realistic test cases that allow developers to measure the relative effectiveness of alternative approaches.  ...  document set and a natural language statement of an information need, called a topic.  ... 
doi:10.1109/mitp.2013.105 fatcat:hk3zocjbxjawhfuye4k7gkcvqq

A survey of Web clustering engines

Claudio Carpineto, Stanislaw Osiński, Giovanni Romano, Dawid Weiss
2009 ACM Computing Surveys  
Web clustering engines organize search results by topic, thus offering a complementary view to the flat-ranked list returned by conventional search engines.  ...  We discuss how to compare the retrieval effectiveness of a clustering engine to that of a plain search engine and how to compare different clustering engines, including an experimental study of the subtopic  ...  ACKNOWLEDGMENTS We are very grateful to three anonymous reviewers for their constructive suggestions and comments.  ... 
doi:10.1145/1541880.1541884 fatcat:e3ndkaq6ovhe3ep6kkamrtol3e

Analyzing and mining a code search engine usage log

Sushil Krishna Bajracharya, Cristina Videira Lopes
2010 Empirical Software Engineering  
When compared to Web search, search behavior in Koders showed many similar patterns. A topic modeling analysis of the usage data shows what topics users of Koders are looking for.  ...  It also provides several suggestions for improvements in code search engines based on the analysis of usage, topics, and query forms.  ...  We thank Andi Zink (VP of Engineering, Black Duck Software) for his efforts in making the Koders usage log public.  ... 
doi:10.1007/s10664-010-9144-6 fatcat:2muoijlpkrfi3lpws3zd3yjqbu

Automatic query reformulations for text retrieval in software engineering

Sonia Haiduc, Gabriele Bavota, Andrian Marcus, Rocco Oliveto, Andrea De Lucia, Tim Menzies
2013 2013 35th International Conference on Software Engineering (ICSE)  
We evaluated Refoqus empirically against four baseline approaches that are used in natural language document retrieval.  ...  Refoqus outperformed the baselines and its recommendations lead to query performance improvement or preservation in 84% of the cases (in average).  ...  Such approaches are designed to work for natural language documents as they rely on word relationships that exist in natural language.  ... 
doi:10.1109/icse.2013.6606630 dblp:conf/icse/HaiducBMOLM13 fatcat:563xkgy6gzedvaqsmdzxez2nxq

Combining text and link analysis for focused crawling—An application for vertical search engines

G. Almpanidis, C. Kotropoulos, I. Pitas
2007 Information Systems  
In this paper, we develop a latent semantic indexing classifier that combines link analysis with text content in order to retrieve and index domain-specific web documents.  ...  The number of vertical search engines and portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler self-evident.  ...  We would like to thank Mr. Athanasios Papaioannou for his contributions in the implementation of the PLSI algorithm.  ... 
doi:10.1016/ fatcat:pqcpgeyu4revtavnhmz4ger2ju

Essie: A Concept-based Search Engine for Structured Biomedical Text

N. C. Ide, R. F. Loane, D. Demner-Fushman
2007 JAMIA Journal of the American Medical Informatics Association  
Essie's design is motivated by an observation that query terms are often conceptually related to terms in a document, without actually occurring in the document text.  ...  Essie shows that a judicious combination of exploiting document structure, phrase searching, and concept based query expansion is a useful approach for information retrieval in the biomedical domain.  ...  Background The Essie search engine was originally developed in 2000 at the National Library of Medicine to support, 5, 6 an online registry of clinical research studies.  ... 
doi:10.1197/jamia.m2233 pmid:17329729 pmcid:PMC2244877 fatcat:l3dg4ofxtnakbdhijktzci4p5i

A functionality taxonomy for document search engines [article]

Rik D.T. Janssen, Henderik A. Proper
2021 arXiv   pre-print
In this paper a functionality taxonomy for document search engines is proposed.  ...  We use the word 'search engine' in the broadest sense possible, including library and web based (meta) search engines.  ...  The taxonomy in this paper may also be viewed as the starting point of an architecture for an open and standardized search infrastructure.  ... 
arXiv:2105.12989v1 fatcat:5y5axtbayrdl3cm6nzerry6pwm