DIAMetrics: Benchmarking Query Engines at Scale

Anja Gruenheid, Shaleen Deep, Kruthi Nagaraj, Hiro Naito, Jeffrey F. Naughton, Stratis Viglas
2020 Proceedings of the VLDB Endowment  
This paper introduces DIAMetrics: a novel framework for end-to-end benchmarking and performance monitoring of query engines.  ...  DIAMetrics has been developed in Google and is being used to benchmark a number of internal query engines. In this paper, we give an overview of DIAMetrics and discuss its design and implementation.  ...  In essence, DIAMetrics enables efficient and consistent benchmarking at scale within Google. FRAMEWORK COMPONENTS We next present the components of the DIAMetrics framework in detail.  ... 
dblp:journals/pvldb/GruenheidDNNNV20 fatcat:bzbege45fbebhoaspy6kdunuc4

Technical Perspective DIAMetrics

Peter Boncz
2021 SIGMOD record  
Creating good benchmarks has been described as something of an art [3].  ...  One can inspire dataset and workload design from "representative" use cases queries, typically informed by domain experts; but also exploit technical insights from database architects in what features,  ...  At Google, the DIAMetrics framework has proven itself useful for database developers, for performance monitoring and regression testing, as engines evolve.  ... 
doi:10.1145/3471485.3471491 fatcat:2po7xbkqkbgcnc5jpledjgoqci

Comprehensive and Efficient Workload Compression [article]

Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey Naughton, Stratis Viglas
2021 arXiv   pre-print
This work studies the problem of constructing a representative workload from a given input analytical query workload where the former serves as an approximation with guarantees of the latter.  ...  These metrics capture the intuition that the distribution of features in a compressed workload should match a target distribution, increasing representativity, and include common queries as well as outliers  ...  DIAMetrics [10] is an end-to-end benchmarking system developed at Google for query engine-agnostic, repeatable benchmarking that is indicative of large-scale production performance.  ... 
arXiv:2011.05549v2 fatcat:owo3fie7zfaadap3o4bhceqf7q


Robin Rehrmann, Carsten Binnig, Alexander Böhm, Kihong Kim, Wolfgang Lehner, Amr Rizk
2018 Proceedings of the VLDB Endowment  
At first sight, OLTP queries, due to their short runtime, may not have enough potential for the additional overhead. In addition, OLTP workloads do not only execute read operations but also updates.  ...  In this paper, we address query sharing for OLTP workloads. We first analyze the sharing potential in real-world OLTP workloads.  ...  The benchmark consists of six workloads, each being a mix of four query types (READ, SCAN, INSERT and UPDATE) with a specified ratio of different query types running at the same time, e.g. workload A consists  ... 
doi:10.14778/3229863.3229866 fatcat:6yga7ytmlrggbick2nhfgqjumu

Are We There Yet? A Decision Framework for Replacing Term Based Retrieval with Dense Retrieval Systems [article]

Sebastian Hofstätter, Nick Craswell, Bhaskar Mitra, Hamed Zamani, Allan Hanbury
2022 arXiv   pre-print
Established retrieval systems running at scale are usually well understood in terms of effectiveness and costs, such as query latency, indexing throughput, or storage requirements.  ...  The guardrails check for failures on certain query characteristics and novel failure types that are only possible in dense retrieval systems.  ...  C-Bias Search engines, like other large-scale information access systems, act as gatekeepers to the world's information.  ... 
arXiv:2206.12993v1 fatcat:a66ji4dl2zhs5c32zsjulna3tm

Good Applications for Crummy Entity Linkers?

Alex Olieman, Kaspar Beelen, Milan van Lange, Jaap Kamps, Maarten Marx
2017 Proceedings of the 13th International Conference on Semantic Systems - Semantics2017  
For instance, should links be specified at the level of a document, or at the level of an individual phrase?  ...  At the least, EL benchmarks should be clearly documented, and would be more useful if they incorporated text from different domains and publication times, featuring a variety of long-tail entity mentions  ... 
doi:10.1145/3132218.3132237 dblp:conf/i-semantics/OliemanBLKM17 fatcat:n6wi2tsq7jap3fegkgyyp4cn64

Good Applications for Crummy Entity Linkers? The Case of Corpus Selection in Digital Humanities [article]

Alex Olieman, Kaspar Beelen, Milan van Lange, Jaap Kamps, Maarten Marx
2017 arXiv   pre-print
For instance, should links be specified at the level of a document, or at the level of an individual phrase?  ...  At the least, EL benchmarks should be clearly documented, and would be more useful if they incorporated text from different domains and publication times, featuring a variety of long-tail entity mentions  ... 
arXiv:1708.01162v1 fatcat:bura6ssx2fgpvjia22pbgkynhe

A Geometric Distance Oracle for Large Real-World Graphs [article]

Deepak Ajwani, W. Sean Kennedy, Alessandra Sala, Iraj Saniee
2014 arXiv   pre-print
There are two sets of prior work against which we benchmark our approach.  ...  There is clearly a need for implementations of graph computational primitives at this scale.  ...  (Query Speed) the time required to query the distance between any two nodes should be very small (e.g., small fraction of a second). In this paper, we focus on a distance oracle for the large-scale graphs  ... 
arXiv:1404.5002v1 fatcat:55auyg3o2jfdvdy74craz3siea

View Sphere Partitioning via Flux Graphs Boosts Recognition from Sparse Views

Morteza Rezanejad, Kaleem Siddiqi
2015 Frontiers in ICT  
Our experiments on exemplar level recognition using 19 models from the Toronto Database and category-level recognition using 150 models from the McGill Shape Benchmark demonstrate that in a scenario of  ...  View-based 3D object recognition requires a selection of model object views against which to match a query view. Ideally, for this to be computationally efficient, such a selection should be sparse.  ...  FUNDING We are grateful to the Natural Sciences and Engineering Research Council of Canada (NSERC) for funding this research.  ... 
doi:10.3389/fict.2015.00024 fatcat:aghuwj6tnfc6lkknwnoctgtfye

Evolutionary Trace for Prediction and Redesign of Protein Functional Sites [chapter]

Angela Wilkins, Serkan Erdin, Rhonald Lua, Olivier Lichtarge
2011 Msphere  
Taking initially an absolute view of variation patterns (1), the ET rank r i of sequence residue i in a query protein was:  ...  Public ET servers are located at: The ET approach to measure the correlation between residue and phylogenetic variations is still under refinement.  ...  On a large-scale SG set, accuracy rose 6% and false positives fell twofold at 65% coverage, compared to ETA. In practice, ETA predictions are being validated experimentally (30).  ... 
doi:10.1007/978-1-61779-465-0_3 pmid:22183528 pmcid:PMC4892863 fatcat:h5sqfghcfzcr3hfrpcbqvkr4yq

Monitoring civil structures with a wireless sensor network

K. Chintalapudi, T. Fu, J. Paek, N. Kothari, S. Rangwala, J. Caffrey, R. Govindan, E. Johnson, S. Masri
2006 IEEE Internet Computing  
Such intervention is acceptable at small scales.  ...  netSHM Wisden was a handcrafted system we developed in consultation with structural engineers, but developing additional tools for structural engineers in this manner clearly doesn't scale.  ... 
doi:10.1109/mic.2006.38 fatcat:7ef2xtafqfgudnkxv44hzowcwa

Big Data Analytics [article]

Ahmed Masmoudi
2017 Zenodo  
Using PageRank in a Search Engine: In order to be considered for the ranking at all, a page has to have at least one of the search terms in the query.  ...  Yahoo Gridmix3: Hadoop cluster benchmarking from Yahoo engineer team. TODO PUMA Benchmarking: Benchmark suite which represents a broad range of MapReduce 1.  ... 
doi:10.5281/zenodo.573349 fatcat:qg7licyavbgbtph6jadfm6bncu

Let's get less optimistic in measurement-based timing analysis

Sven Bunte, Michael Zolda, Raimund Kirner
2011 2011 6th IEEE International Symposium on Industrial and Embedded Systems  
The size of the input list for bubble_sort is reduced from 100 to 10 as we utilize a bounded model checker that does not scale well for this particular benchmark.  ...  In contrast, this article targets optimism as a diametrically opposed aspect.  ... 
doi:10.1109/sies.2011.5953663 dblp:conf/sies/BunteZK11 fatcat:txbqyhtbznfmfmyaw6gkwhgchi

Adaptive Self-tuning Memory in DB2

Adam J. Storm, Christian Garcia-Arellano, Sam Lightstone, Yixin Diao, Maheswaran Surendra
2006 Very Large Data Bases Conference  
At first we ran 16 concurrent streams of TPC-H query 13, a decision-support query with low requirements for sort memory.  ...  The workload being run by each of the databases consisted of 4 clients, each running the 22 queries used in the TPC-H benchmark.  ... 
dblp:conf/vldb/StormGLDS06 fatcat:ipq4fpgudzawpn2ervme2fmspa

Face-centred Voronoi Refinement for Surface Mesh Generation

Darren Engwirda, David Ivers
2014 Procedia Engineering  
Experiments are conducted using a range of complex benchmarks, verifying the robustness and practical performance of the proposed scheme.  ...  These queries are computed efficiently by storing the surface definition P in an aabb-tree [23].  ...  Such scaling ensures that size constraints are applied with respect to mean edge length.  ... 
doi:10.1016/j.proeng.2014.10.364 fatcat:gelixfevx5hd3aiyifogzxwdeq