Circumventing Data Quality Problems Using Multiple Join Paths

Yannis Kotidis, Amélie Marian, Divesh Srivastava
2006 Clean Database  
MJP associates quality scores with candidate answers by first scoring individual data paths between a pair of field values taking into account data quality with respect to specified integrity constraints  ...  We address the problem of finding the top-few (highest quality) answers in the MJP framework using novel techniques, and demonstrate the utility of our techniques using real data and our Virtual Integration  ...  Efficiency of Top-k Evaluation We use the number of probes to the applications to determine the efficiency of top-k evaluation.  ... 
dblp:conf/cleandb/KotidisMS06 fatcat:xpmuzzjhjfg6dlqfsrjb7j27iu

FedSearch: Efficiently Combining Structured Queries and Full-Text Search in a SPARQL Federation [chapter]

Andriy Nikolov, Andreas Schwarte, Christian Hütter
2013 Lecture Notes in Computer Science  
However, executing hybrid search queries in a federation of multiple data sources presents a number of challenges due to data source heterogeneity and lack of statistical data about keyword selectivity  ...  By performing on-the-fly adaptation of the query execution plan and intelligent grouping of query clauses, we are able to reduce significantly the communication costs making our approach suitable for top-k  ...  For processing top-k queries and hybrid queries in particular, however, this mechanism is insufficient for several reasons: -Optimal scheduling of remote requests can differ for top-k queries and queries  ... 
doi:10.1007/978-3-642-41335-3_27 fatcat:tt7gztkmsjg5zbypo2rbgf45ta

Revisiting Pipelined Parallelism in Multi-Join Query Processing

Bin Liu, Elke A. Rundensteiner
2005 Very Large Data Bases Conference  
Multi-join queries are the core of any integration service that integrates data from multiple distributed data sources.  ...  Due to the large number of data sources and possibly high volumes of data, the evaluation of multi-join queries faces increasing scalability concerns.  ...  Paul Larson from Microsoft Research Lab for his feedback on the initial idea of this paper. We thank all WPI DSRG members for their useful comments.  ... 
dblp:conf/vldb/LiuR05 fatcat:nrylbigchjgdbpwhfjrj5aq2hi

Ontology-Based Top-k Query Answering over Massive, Heterogeneous, and Dynamic Data

Daniele Dell'Aglio
2013 International Semantic Web Conference  
In my research activity I will study the problem of computing the top k relevant items given a collection of data sets with both streaming and static data, an ontology describing them, and a set of top-k  ...  data providers (e.g., Amadeus), I consider this process as executed by a black box system to highlight the features of the data.  ...  Related work The raising of data stream sources introduced new problems about how to manage, process and query infinite sequences of data with high frequency rate.  ... 
dblp:conf/semweb/DellAglio13 fatcat:7oxyb3p45zcqpn62zkro2edjue

Processing top-k join queries

Minji Wu, Laure Berti-Équille, Amélie Marian, Cecilia M. Procopiuc, Divesh Srivastava
2010 Proceedings of the VLDB Endowment  
We consider the problem of efficiently finding the top-k answers for join queries over web-accessible databases.  ...  We evaluate our algorithms on a variety of queries and data sets and demonstrate the significant benefits they provide.  ...  In the typical top-k query model, the score of each object is computed based on a number of attributes stored at data sources.  ... 
doi:10.14778/1920841.1920951 fatcat:vil4gwmqtrd6rmi6lxr4hq44se

Elastic and Scalable Processing of Linked Stream Data in the Cloud [chapter]

Danh Le-Phuoc, Hoan Nguyen Mau Quoc, Chan Le Van, Manfred Hauswirth
2013 Lecture Notes in Computer Science  
It enables the integration and joint processing of heterogeneous stream data with quasi-static data from the Linked Data Cloud in near-real-time.  ...  Several Linked Stream Data processing engines exist but their scalability still needs to be improved in terms of (static and dynamic) data sizes, number of concurrent queries, stream update frequencies  ...  Examples of such data sources include sensors, embedded systems, mobile devices, Twitter, and social networks, with a steep, exponential growth predicted in the number of sources and the amount of data  ... 
doi:10.1007/978-3-642-41335-3_18 fatcat:w4ogqbov6ffctm3lp6hrsdfwja

Minimal probing

Kevin Chen-Chuan Chang, Seung-won Hwang
2002 Proceedings of the 2002 ACM SIGMOD international conference on Management of data - SIGMOD '02  
This paper addresses the problem of evaluating ranked top-k queries with expensive predicates.  ...  We then propose Algorithm MPro which, by implementing the principle, is provably optimal with minimal probe cost. Further, we show that MPro can scale well and can be easily parallelized.  ...  Acknowledgements: We thank Divyakant Agrawal and Wen-Syan Li for their fruitful discussions during one of the authors' summer visit at NEC USA CCRL, which inspired us to pursue this work.  ... 
doi:10.1145/564691.564731 dblp:conf/sigmod/ChangH02 fatcat:vkvovhdlpfdh7hrdkmvcfjjuhu

Minimal probing

Kevin Chen-Chuan Chang, Seung-won Hwang
2002 Proceedings of the 2002 ACM SIGMOD international conference on Management of data - SIGMOD '02  
This paper addresses the problem of evaluating ranked top-k queries with expensive predicates.  ...  We then propose Algorithm MPro which, by implementing the principle, is provably optimal with minimal probe cost. Further, we show that MPro can scale well and can be easily parallelized.  ...  Acknowledgements: We thank Divyakant Agrawal and Wen-Syan Li for their fruitful discussions during one of the authors' summer visit at NEC USA CCRL, which inspired us to pursue this work.  ... 
doi:10.1145/564728.564731 fatcat:tivkvliplzaynjpipkgmohxc6u

Database research at the University of Illinois at Urbana-Champaign

M. Winslett, K. Chang, A. Doan, J. Han, C. Zhai, Y. Zhou
2002 SIGMOD record  
at Argonne Na- The MPro project addresses the problem of evaluating ranked top-k queries with expensive predicates.  ...  Hwang, Minimal Probing: Supporting Expensive Predicates for Top-k Queries, Proceedings of the 2002 ACM SIGMOD Conference, Madison, Wisconsin, June 2002.  ... 
doi:10.1145/601858.601881 fatcat:ff6mvu2aorherel6rvb46zj6hm

Collaborative Computation of Top-K Request Processing Over Unpredictable Data

Kopuru Anusri
2017 International Journal for Research in Applied Science and Engineering Technology  
This script tackles the dispute of processing top-K queries over unpredictable data by stream sourcing for hastily converging to the real ordering of proper emanates.  ...  Querying unpredictable data has turn into a popular letter for the sake of the reproduction of user-generated composition from societal communications and of data streams from sensors.  ...  We draw up the trouble of Uncertainty Resolution (UR) in the text of top-K doubt processing over unpredictable data with congest responsibility.  ... 
doi:10.22214/ijraset.2017.10328 fatcat:og4fo6nvvzfrzkoow5zyjhnzoq

Deadlock-free joins in DB-mesh, an asynchronous systolic array accelerator

Bingyi Cao, Kenneth A. Ross, Stephen A. Edwards, Martha A. Kim
2017 Proceedings of the 13th International Workshop on Data Management on New Hardware - DAMON '17  
Previous database accelerator proposals such as the Q100 provide a fixed set of database operators, chosen to support a target query workload.  ...  DB-Mesh is an asynchronous systolic array that is more generic than the Q100, and can be configured to run a variety of operators with configurable parameters such as record widths.  ...  Moreover, the operations are happening in parallel, with multiple records in a single table processed in a pipeline, and multiple tables processed in separate parallel instances of the accelerators.  ... 
doi:10.1145/3076113.3076118 dblp:conf/damon/CaoREK17 fatcat:fyt6jnrnzbgmxb5uknvj2xkepe

Probe Minimization by Schedule Optimization: Supporting Top-K Queries with Expensive Predicates

Seung-won Hwang, Kevin Chen-Chuan Chang
2007 IEEE Transactions on Knowledge and Data Engineering  
This paper addresses the problem of evaluating ranked top-k queries with expensive predicates.  ...  Index Terms: Database query processing, distributed information systems, database systems.  ...  This paper is based on and significantly extends our preliminary works "Minimal Probing: Supporting Expensive Predicates for Top-k Queries" in the Proceedings of the ACM SIGMOD 2002.  ... 
doi:10.1109/tkde.2007.1007 fatcat:lbiy7p24jjd7lfinarwhs56avy

RankReduce - Processing K-Nearest Neighbor Queries on Top of MapReduce

Aleksandar Stupar, Sebastian Michel, Ralf Schenkel
2010 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
We consider the problem of processing K-Nearest Neighbor (KNN) queries over large datasets where the index is jointly maintained by a set of machines in a computing cluster.  ...  K-Nearest Neighbors.  ...  We measured the number of mappers per query and precision for 50 KNN queries, with K=20, for both datasets, with 2GB, 4GB, and 8GB of indexed data (∼4000, ∼8000, and ∼16000 feature vectors for the real  ... 
dblp:conf/sigir/StuparMS10 fatcat:fo6xocjhk5bjznlkinossf2pme

Sharing work in keyword search over databases

Marie Jacob, Zachary Ives
2011 Proceedings of the 2011 international conference on Management of data - SIGMOD '11  
The ATC manages the flow of tuples among a multitude of pipelined operators, minimizing the work needed to return the top-k answers for all queries.  ...  It computes and executes a set of relational sub-queries whose results are combined to produce the k highest ranking answers.  ...  In addition, we seek to preserve an important aspect of top-k query processing work [5, 7, 9, 15, 23, 28] : the streaming sources (each with data sorted in nonincreasing order of score) are read in a  ... 
doi:10.1145/1989323.1989384 dblp:conf/sigmod/JacobI11 fatcat:qumcfaqgjfdgjpwscpfjztzsmq

Run-Time Adaptivity for Search Computing [chapter]

Daniele Braga, Michael Grossniklaus, Norman W. Paton
2011 Lecture Notes in Computer Science  
As a result, search computing seems likely to benefit from adaptive query processing, where information obtained during query evaluation is used to change the way in which a query is executing.  ...  In Search Computing, queries act over internet resources, and combine access to standard web services with exact results and to ranked search services.  ...  of results with top-k guarantees in the presence of unpredictable score information for search services, and also (c) because of the intervention of the end user in the query execution process, which  ... 
doi:10.1007/978-3-642-19668-3_15 fatcat:emxhr6wjavedbdgkx4gwzxszla