Generalized substring selectivity estimation

Zhiyuan Chen, Flip Korn, Nick Koudas, S. Muthukrishnan
2003 Journal of computer and system sciences (Print)  
We present a novel approach to selectivity estimation for generalized Boolean substring queries with a focus on the two cases of (1) conjunctive multidimensional and (2) Boolean queries.  ...  Selectivity estimation for generalized Boolean queries has not been studied previously; our own prior work, which is discussed and extended herein, applies to the case of one-dimensional Boolean queries  ...  Acknowledgments We thank Divesh Srivastava for thought-provoking discussions, and for supplying code and data sets for our experiments.  ... 
doi:10.1016/s0022-0000(02)00031-4 fatcat:dsrih3esffflbgnxs4sskyzi7y

Research on Information Retrieval System that Supports Keyword Selection based on Generalized Concept and Coverage

2005 Transactions of the Japanese society for artificial intelligence  
We proposed a new IR system called "appropriate Boolean query reformulation for IR with adaptive generalization" (ABRIR-AG) to support Boolean query formation.  ...  Therefore, we used a thesaurus for query expansion.  ... 
doi:10.1527/tjsai.20.270 fatcat:jtla3clrzratjko4hf4xzeqxtq

Page 3295 of Mathematical Reviews Vol. , Issue 2004d [page]

2004 Mathematical Reviews  
We present a novel approach to selectivity estimation for generalized Boolean substring queries with a focus on the two cases of (1) conjunctive multidimensional queries and (2) Boolean queries.  ...  Selectivity estimation for generalized Boolean queries has not been studied previously; our own prior work, which is discussed and extended herein, applies to the case of one-dimensional Boolean queries  ... 

Ranked queries over sources with Boolean query interfaces without ranking support

Vagelis Hristidis, Yuheng Hu, Panagiotis G. Ipeirotis
2010 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)  
For instance, PubMed allows users to submit highly expressive Boolean keyword queries, but ranks the query results by date only.  ...  In this paper we present algorithms that return the top results for a query, ranked according to an IR-style ranking function, while operating on top of a source with a Boolean query interface with no  ...  Therefore, we can estimate the benefit of a query q, defined as the probability that a randomly selected document from the answer of q will have score higher than the k-th ranked score for Q among the  ... 
doi:10.1109/icde.2010.5447918 dblp:conf/icde/HristidisHI10 fatcat:vzjsd5jzfjhk5h3aphrfxwnf3u

Minimizing the Average Query Complexity of Learning Monotone Boolean Functions

Vetle I. Torvik, Evangelos Triantaphyllou
2002 INFORMS journal on computing  
This paper addresses the problem of completely reconstructing deterministic monotone Boolean functions via membership queries.  ...  A framework for unbiased average case comparison of monotone Boolean function inference algorithms is developed using unequal probability sampling.  ...  An Unbiased Estimator for the Average Case Complexity Consider the finite universe of fixed quantities {I 1 , Boolean functions might be selected more than once, while their corresponding quantity is used  ... 
doi:10.1287/ijoc. fatcat:eols7xq2dreyzclr3chig2uwiq

Comparing Boolean and probabilistic information retrieval systems across queries and disciplines

Robert M. Losee
1997 Journal of the American Society for Information Science  
Using these performance predicting techniques, sample performance figures are provided for queries using the Boolean and and or, as well as for probabilistic systems assuming statistical term independence  ...  , given values for query and database characteristics.  ...  Given this knowledge, the better search engine for a particular query and database combination can be selected.  ... 
doi:10.1002/(sici)1097-4571(199702)48:2<143::aid-asi5>;2-y fatcat:l6vk37ddvnbyppua4fkqfyhrhi

LCA-based selection for XML document collections

Georgia Koloniari, Evaggelia Pitoura
2010 Proceedings of the 19th international conference on World wide web - WWW '10  
We address both a boolean and a weighted version of the database selection problem.  ...  In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based on their goodness to the  ...  Selectivity Estimation for XML Documents: Summaries for XML documents have also been used to provide selectivity estimations for queries against XML documents.  ... 
doi:10.1145/1772690.1772743 dblp:conf/www/KoloniariP10 fatcat:mttb2juhrneezafc4weymrsn4m

Unbiased estimation of size and other aggregates over hidden web databases

Arjun Dasgupta, Xin Jin, Bradley Jewell, Nan Zhang, Gautam Das
2010 Proceedings of the 2010 international conference on Management of data - SIGMOD '10  
We propose novel techniques which use a small number of queries to produce unbiased estimates with small variance.  ...  These techniques can also be used for approximate query processing over hidden databases. We present theoretical analysis and extensive experiments to illustrate the effectiveness of our approach.  ...  In particular, consider a Boolean database. The sampler starts with query SELECT * FROM D.  ... 
doi:10.1145/1807167.1807259 dblp:conf/sigmod/DasguptaJJZD10 fatcat:y7mnz5exj5bnjozqsjeevuxmru

Guided inference of nested monotone Boolean functions

V Torvik
2003 Information Sciences  
The most efficient known approach to minimizing the average query complexity in inferring a single monotone Boolean function is based on a query selection criterion.  ...  It is shown that the selection criterion approach is easily modified for use with restricted oracles.  ...  Fig. 10 shows the average number of queries for the three problems when selection criteria are used. The Horvitz-Thompson [8] estimator was used to compute the averages for n greater than 4.  ... 
doi:10.1016/s0020-0255(03)00062-8 fatcat:s2c7nbxaqfbl3hte74npl2axyi

Query Variation Performance Prediction for Systematic Reviews

Harrisen Scells, Leif Azzopardi, Guido Zuccon, Bevan Koopman
2018 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval - SIGIR '18  
in the query selection process.  ...  Given the possible query variations that they could construct, selecting the best performing query is difficult.  ...  Unlike the traditional Query Performance Prediction (QPP) task [5], where the performance of queries across different topics is estimated, QVPP attempts to estimate the performance of queries for the  ... 
doi:10.1145/3209978.3210078 dblp:conf/sigir/ScellsAZK18 fatcat:xnzvnhijojcwje5cmjcvz2hwee

Cost based plan selection for xpath

Haris Georgiadis, Minas Charalambides, Vasilis Vassalos
2009 Proceedings of the 35th SIGMOD international conference on Management of data - SIGMOD '09  
An important part of the framework is PSA, a very efficient cost-based plan selection algorithm for XPath queries.  ...  In the presented experimental evaluation, PSA picked the cheapest estimated query plan in 100% of the cases.  ...  Cardinality and partial selectivity estimation for vf and Ъvf respectively involve the DistinctValues() method, similarly to selectivity estimation of selection operators in relational systems.  ... 
doi:10.1145/1559845.1559909 dblp:conf/sigmod/GeorgiadisCV09 fatcat:q2eje55ekrgzrkrk2yqllsoocq

Cheshire II: Designing a next-generation online catalog

Ray R. Larson, Jerome McDonough, Paul O'Leary, Lucy Kuntz, Ralph Moon
1996 Journal of the American Society for Information Science  
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 47 (7) :555-567,  ...  It is based on a number of national and international standards for data description, communication, and interface technology.  ...  The authors also thank the anonymous reviewers for helpful suggestions on improving this article.  ... 
doi:10.1002/(sici)1097-4571(199607)47:7<555::aid-asi7>;2-t fatcat:oux75vpq5jb5rhtryz7dvgao3a

A novel method for the evaluation of Boolean query effectiveness across a wide operational range

Eero Sormunen
2000 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '00  
A new laboratory-based evaluation method for Boolean IR systems is proposed.  ...  Traditional methods for the system-oriented evaluation of Boolean IR systems suffer from validity and reliability problems.  ...  An estimate for maximum precision across the whole relative recall range was determined by applying a simple incremental algorithm: 1.  ... 
doi:10.1145/345508.345541 dblp:conf/sigir/Sormunen00 fatcat:g5tgw4ealba5ncbfxe7oifnclu

Automatic boolean query suggestion for professional search

Youngho Kim, Jangwon Seo, W. Bruce Croft
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information - SIGIR '11  
Recent surveys have also verified that professional searchers continue to have a strong preference for Boolean queries because they provide a record of what documents were searched.  ...  To support this type of professional search, we propose a novel Boolean query suggestion technique.  ...  Boolean Query Quality Predictors are features with the purpose of estimating Boolean query quality.  ... 
doi:10.1145/2009916.2010026 dblp:conf/sigir/KimSC11 fatcat:nmiyrne52bhhlmwppk32dh5lvm

Similarity Driven Approximation for Text Analytics [article]

Guangyan Hu, Yongfeng Zhang, Sandro Rigo, Thu D. Nguyen
2020 arXiv   pre-print
For example, when sampling at 10%, EmApprox speeds up a set of queries counting phrase occurrences by almost 10x while achieving estimated relative errors of less than 22% for 90% of the queries.  ...  Then, at query processing time, EmApprox uses the index to guide sampling of the data set, with the probability of selecting each subcollection of documents being proportional to its similarity to the  ...  EmApprox can facilitate subcollection selection under the vector space retrieval paradigm for DIR [18], [19]. We target Boolean and ranked retrieval models for DIR in the following discussion.  ... 
arXiv:1910.07144v2 fatcat:yk7wywjrdnegbixwaeho2m562u