SparkCruise

Abhishek Roy, Alekh Jindal, Hiren Patel, Ashit Gosalia, Subru Krishnan, Carlo Curino
2019 Proceedings of the VLDB Endowment  
In the paper, we propose to demonstrate SparkCruise, a computation reuse system that automatically selects the most useful common computations to materialize based on the past query workload.  ...  SparkCruise materializes these computations as part of query processing, so the users can continue with their query processing just as before and computation reuse is automatically applied in the background  ...  Handling Workload Changes SparkCruise relies on the past workload being a strong indicator of the future.  ... 
doi:10.14778/3352063.3352082 fatcat:bebafyit5nbgtibyr764ltvuui

Revisiting Reuse in Main Memory Database Systems [article]

Kayhan Dursun, Carsten Binnig, Ugur Cetintemel, Tim Kraska
2016 arXiv pre-print
Reusing intermediates in databases to speed up analytical query processing has been studied in the past.  ...  The reason is that modern main memory DBMSs are typically limited by the bandwidth of the memory bus, thus query execution is heavily optimized to keep tuples in the CPU caches and registers.  ...  INTRODUCTION Motivation: Reusing intermediates in databases to speed up analytical query processing has been studied in the past [15, 25, 18, 13, 8, 20, 28].  ... 
arXiv:1608.05678v1 fatcat:lqibmggfvreyxkxdv7uijnfigy

Revisiting Reuse in Main Memory Database Systems

Kayhan Dursun, Carsten Binnig, Ugur Cetintemel, Tim Kraska
2017 Proceedings of the 2017 ACM International Conference on Management of Data - SIGMOD '17  
As queries arrive, our reuse-aware optimizer reasons about the reuse opportunities for hash tables, employing cost models that take into account hash table statistics together with the CPU and data movement  ...  Reusing intermediates in databases to speed up analytical query processing was studied in prior work.  ...  This research is supported in part by the Intel Science and Technology Center for Big Data, NSF IIS-1526639 and NSF IIS-1514491.  ... 
doi:10.1145/3035918.3035957 dblp:conf/sigmod/DursunBCK17 fatcat:o7j5b6un3fethe53csln6yhbby

Computation Reuse in Analytics Job Service at Microsoft

Alekh Jindal, Sriram Rao, Shi Qiao, Hiren Patel, Zhicheng Yin, Jieming Di, Malay Bag, Marc Friedman, Yifung Lin, Konstantinos Karanasos
2018 Proceedings of the 2018 International Conference on Management of Data - SIGMOD '18  
The key aspects of our system are the following: (i) we reuse computations by creating materialized views over recurring workloads, i.e., periodically executing jobs that have the same script templates  ...  We present a detailed analysis from our production workloads to motivate the computation overlap problem and the possible gains from computation reuse.  ...  Both recurring and progressive optimization focus on the problem of inaccurate or missing statistics in query optimization, and not on reusing common subexpressions across jobs.  ... 
doi:10.1145/3183713.3190656 dblp:conf/sigmod/JindalQPYDBFLKR18 fatcat:46raoj4saba7vdj7kjneo6vf3m

A Framework to Subquery Optimization using Case-based Reasoning

Pragya Shukla, Sakshi Mathur
2014 International Journal of Computer Applications  
An efficient algorithm to identify similar queries in a given query and optimize the query based on similarity is presented.  ...  The key idea is to identify cases of similar subqueries that often appear in a complex query and share the optimization result within each case in the query [3].  ...  We then review a representative selection of CBR research in the past few decades on aspects of retrieval, reuse, retention.  ... 
doi:10.5120/15324-3636 fatcat:q7eajohs6ve5jazwqpadgadkfi

Revisiting reuse for approximate query processing

Alex Galakatos, Andrew Crotty, Emanuel Zgraggen, Carsten Binnig, Tim Kraska
2017 Proceedings of the VLDB Endowment  
As part of our approach, we apply a variety of optimization techniques that are based on probability theory, including new query rewrite rules and index structures.  ...  However, existing AQP techniques start to break down when confronted with ad hoc queries that target the tails of the distribution.  ...  This research is funded in part by the NSF CAREER Award IIS-1453171, NSF Award IIS-1514491, Air Force YIP AWARD FA9550-15-1-0144, and the Intel Science and Technology Center for Big Data, as well as gifts  ... 
doi:10.14778/3115404.3115418 fatcat:osv3cjh3abfo3c66y24ua5363e

An Imitation Learning Approach for Cache Replacement [article]

Evan Zheran Liu, Milad Hashemi, Kevin Swersky, Parthasarathy Ranganathan, Junwhan Ahn
2020 arXiv pre-print
When evaluated on 13 of the most memory-intensive SPEC applications, Parrot increases cache hit rates by 20% over the current state of the art.  ...  While directly applying Belady's is infeasible since the future is unknown, we train a policy conditioned only on past accesses that accurately approximates Belady's even on diverse and complex access  ...  We also thank Chelsea Finn, Lisa Lee, and Amir Yazdanbakhsh for their comments on a draft of this paper.  ... 
arXiv:2006.16239v2 fatcat:v2gm2hjanrdhnafysbswk7dsku

Recycling in pipelined query evaluation

F. Nagel, P. Boncz, S. D. Viglas
2013 2013 IEEE 29th International Conference on Data Engineering (ICDE)  
ones making best use of a limited intermediate result cache.  ...  The novelty of this paper is to show how recycling can successfully be applied in pipelined query executors, by tracking the benefit of materializing possible intermediate results and then choosing the  ...  Acknowledgments We would like to thank Milena Ivanova for supplying the SkyServer dataset used in [10] and the MonetDB and Vectorwise teams for their support.  ... 
doi:10.1109/icde.2013.6544837 dblp:conf/icde/NagelBV13 fatcat:fkdoiz35qfcttleiaz4ftaqgcu

Adaptive Cache Mode Selection for Queries over Raw Data

Tahir Azim, Azqa Nadeem, Anastasia Ailamaki
2018 Very Large Data Bases Conference  
Instead, the ideal caching mode depends on the workload, the dataset and the cache size. We further show that choosing the sub-optimal caching mode can result in a performance penalty of over 200%.  ...  Caching the results of intermediate query results for future re-use is a common technique for improving the performance of analytics over raw data sources.  ...  In the area of databases, a large body of work exists on the problem of caching and reusing the results of previous query executions to improve performance. Caching Disk Pages.  ... 
dblp:conf/vldb/AzimNA18 fatcat:qxutf5pcvzeknka6rrpfjxtkfa

Progressive optimization in a shared-nothing parallel database

Wook-Shin Han, Jack Ng, Volker Markl, Holger Kache, Mokhtar Kandil
2007 Proceedings of the 2007 ACM SIGMOD international conference on Management of data - SIGMOD '07  
Queries used in such large data warehouses can contain complex predicates as well as multiple joins, and the resulting query execution plans generated by the optimizer may be suboptimal due to mis-estimates  ...  Experimental results show that our solution has negligible runtime overhead and accelerates the performance of complex OLAP queries by up to a factor of 22.  ...  While considerable progress has been made over the past decade, parallel query optimization remains an active field of research fueled by the advent of data-intensive business intelligence and data warehousing  ... 
doi:10.1145/1247480.1247569 dblp:conf/sigmod/HanNMKK07 fatcat:rtweo2qkezhrbfym3dys3ik3de

ReStore: Reusing Results of MapReduce Jobs [article]

Iman Elghandour, Ashraf Aboulnaga
2012 arXiv pre-print
ReStore can reuse the output of whole MapReduce jobs that are part of a workflow, and it can also create additional reuse opportunities by materializing and storing the output of query execution operators  ...  We have implemented ReStore as an extension to the Pig dataflow system on top of Hadoop, and we experimentally demonstrate significant speedups on queries from the PigMix benchmark.  ...  Acknowledgements This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the Business Intelligence Network strategic networks grant.  ... 
arXiv:1203.0061v1 fatcat:hwyv7mpidne33lg7ifo524o2bm

ReStore

Iman Elghandour, Ashraf Aboulnaga
2012 Proceedings of the VLDB Endowment  
ReStore can reuse the output of whole MapReduce jobs that are part of a workflow, and it can also create additional reuse opportunities by materializing and storing the output of query execution operators  ...  We have implemented ReStore as an extension to the Pig dataflow system on top of Hadoop, and we experimentally demonstrate significant speedups on queries from the PigMix benchmark.  ...  Acknowledgements This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the Business Intelligence Network strategic networks grant.  ... 
doi:10.14778/2168651.2168659 fatcat:cfuepiieavdirkg4tnuks7cooy

SQPR: Stream query planning with reuse

Evangelia Kalyvianaki, Wolfram Wiesemann, Quang Hieu Vu, Daniel Kuhn, Peter Pietzuch
2011 2011 IEEE 27th International Conference on Data Engineering  
Allocation decisions must provide the correct mix of resources required by queries, while achieving an efficient overall allocation to scale in the number of admitted queries.  ...  By exploiting overlap between queries and reusing partial results, a query planner can conserve resources but has to carry out more complex planning decisions.  ...  The authors would like to thank Marco Fiscato for his work on developing the prototype DISSP system.  ... 
doi:10.1109/icde.2011.5767851 dblp:conf/icde/KalyvianakiWVKP11 fatcat:osrmlscwuvddjhocpenioow35i

ReStore: Reusing results of MapReduce jobs

Ashraf Aboulnaga, Dr. Iman Elghandour
2012 Qatar Foundation Annual Research Forum Proceedings  
ReStore can reuse the output of whole MapReduce jobs that are part of a workflow, and it can also create additional reuse opportunities by materializing and storing the output of query execution operators  ...  We have implemented ReStore as an extension to the Pig dataflow system on top of Hadoop, and we experimentally demonstrate significant speedups on queries from the PigMix benchmark.  ...  Acknowledgements This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) through the Business Intelligence Network strategic networks grant.  ... 
doi:10.5339/qfarf.2012.aesnp3 fatcat:bzimxfw25nehfigofzuh2hq7bq

Approximate Query Engines

Barzan Mozafari
2017 Proceedings of the 2017 ACM International Conference on Management of Data - SIGMOD '17  
For example, we discuss how a database can reuse its past work in a generic way, and become smarter as it answers new queries.  ...  Recent years have witnessed a surge of interest in Approximate Query Processing (AQP) solutions, both in academia and the commercial world.  ...  Database Learning -Traditional databases have limited opportunities to reuse past query answers.  ... 
doi:10.1145/3035918.3056098 dblp:conf/sigmod/Mozafari17 fatcat:gqhlaf6ao5hbndiidhqzf3mchi