Improving the Performance of Pipelined Query Processing with Skipping [chapter]

Simon Jonassen, Svein Erik Bratsberg
2012 Lecture Notes in Computer Science  
In this paper, we evaluate the effect of inverted index skipping on the performance of pipelined query processing.  ...  However, the query processing latency and scalability with respect to the collections size are the main challenges associated with this method.  ...  This work was supported by the iAd Centre and funded by the Norwegian University of Science and Technology and the Research Council of Norway.  ... 
doi:10.1007/978-3-642-35063-4_1 fatcat:uc7ngktrenb2do5ualxmpyftfm

Improving the performance of pipelined query processing with skipping—and its comparison to document-wise partitioning

Simon Jonassen, Svein Erik Bratsberg
2013 World wide web (Bussum)  
In this paper, we evaluate the effect of inverted index skipping on the performance of pipelined query processing.  ...  However, the query processing latency and scalability with respect to the collections size are the main challenges associated with this method.  ...  This work was supported by the iAd Centre and funded by the Norwegian University of Science and Technology and the Research Council of Norway.  ... 
doi:10.1007/s11280-013-0260-2 fatcat:czrimud3ifd2tkm75bkt2u3tby

Efficient query processing in distributed search engines

Simon Jonassen
2012 SIGIR Forum  
Subsequently, we present several skipping extensions to pipelined query processing, which as we show can improve the query processing performance and/or the quality of results.  ...  Then, we extend one of these methods with intra-query parallelism, which as we show can improve the performance at low query loads.  ...  Øystein Acknowledgment: This work was supported by the iAd Project funded by the Research Council of Norway and the Norwegian University of Science and Technology.  ... 
doi:10.1145/2492189.2492201 fatcat:uwasxhngrfgntemkhawyv3te64

An evaluation of binary xml encoding optimizations for fast stream based xml processing

R. J. Bayardo, D. Gruhl, V. Josifovski, J. Myllymaki
2004 Proceedings of the 13th conference on World Wide Web - WWW '04  
of high performance XML stream processing.  ...  Our goal is to provide a deeper understanding of the performance impacts of binary XML encodings in order to clarify the ongoing and often contentious debate over their merits, particularly in the domain  ...  Skip-pointers provide the most substantial performance improvements, though only in limited circumstances, and at the expense of pipelining.  ... 
doi:10.1145/988672.988719 dblp:conf/www/BayardoGJM04 fatcat:hyzjr2xr6ndzpokfnbl75weh7q


Xin Rong, Zhe Chen, Qiaozhu Mei, Eytan Adar
2016 Proceedings of the Ninth ACM International Conference on Web Search and Data Mining - WSDM '16  
Empirical evaluation against state-of-the-art baselines shows that our solution, EgoSet, is able to not only capture multiple facets in the input query, but also generate expansions for each facet with  ...  In this paper, we present a novel solution to handling multifaceted seeds by combining existing user-generated ontologies with a novel word-similarity metric based on skip-grams.  ...  Acknowledgments This work is partially supported by the National Science Foundation under grant numbers IIS-1054199 and CCF-1048168. We thank our reviewers for very helpful comments and suggestions.  ... 
doi:10.1145/2835776.2835808 dblp:conf/wsdm/RongCMA16 fatcat:yfjg4asinzel3n7ea3rwgh6hle

Vectorizing an In Situ Query Engine

Panagiotis Sioulas, Anastasia Ailamaki
2016 Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16  
Specifically, we examine the effect of SIMD on two different cases: the scan operators that perform the CPU-intensive task of input parsing, and the part of the query pipeline that performs a selection  ...  We show that a vectorized approach has a lot of potential to improve performance, which nevertheless comes with trade-offs.  ...  Then, we extend the query engine with SIMD processing primitives to further reduce processing costs, which allow performing the same operation for a vector of data at a time, thus taking advantage of data  ... 
doi:10.1145/2882903.2914829 dblp:conf/sigmod/SioulasA16 fatcat:kikxqopcuzdmdmoq3wamezhesy

Editorial for the special issue on advanced information systems for the Web

Alex Delis, X. Sean Wang
2014 World wide web (Bussum)  
by novel techniques for the provision of high availability for data, and finally, by policies for the orderly operation of the co-existing and collaborative virtual ecosystems.  ...  The conference attracted 194 submissions of which 44 were finally presented. In this special issue, we offer extended versions of selected works from the WISE'12 conference.  ...  The fourth paper entitled "Improving the Performance of Pipelined Query Processing with Skipping" investigates the effect of inverted index skipping on the performance of pipelined query processing for  ... 
doi:10.1007/s11280-014-0281-5 fatcat:qb2twi2gq5hwpn4n7ftftvtxku

Push vs. Pull-Based Loop Fusion in Query Engines [article]

Amir Shaikhha, Mohammad Dashti, Christoph Koch
2016 arXiv   pre-print
only been used with one of the approaches.  ...  We draw parallels between the DB and PL communities by demonstrating the connection between pipelined query engines and loop fusion techniques.  ...  One of the main reasons is that the inline-aware implementation generates around 40% less query processing code in comparison with the naïve implementation for query processing in these two queries.  ... 
arXiv:1610.09166v1 fatcat:2ohk4kxmyvhijkw5f4v7vc7n2y

Skewed partial bitvectors for list intersection

Andrew Kane, Frank Wm. Tompa
2014 Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval - SIGIR '14  
This query processing pipeline is summarized in Figure 2.2. The intersection and ranking steps of the query processing pipeline are the most time-consuming parts of a large search system.  ...  In all cases, using bitvectors greatly improves the runtime performance of the slow queries, as shown by comparing the performance of bitvectors+skips to the performance of skips in Figure 6.4 (bottom  ... 
doi:10.1145/2600428.2609609 dblp:conf/sigir/KaneT14 fatcat:fuwxdwf7x5cd7mie277owhkdt4

BigDataStack - D4.1 WP4 Scientific Report and Prototype Description - Y1

Paula Ta Shma, Yosef Moatti, Stathis Plitsos, Guy Khazma, Pavlos Kranas, Luis Tomás Bolívar, Marta Patiño, Ainhoa Azqueta, Christos Doulkeridis, Peter Jason Gould, Dimitris Poulopoulos
2020 Zenodo  
The Data as a Service block presents a fine set of data services which can be mapped to the major phases of Big Data processing.  ...  The architecture and the design of these data services are achieved through dedicated techniques, contextualized in the BigDataStack environment in order to run on top of the data-driven infrastructure  ...  Thousands of events can be generated and processed per minute. • Performance evaluation with stateful queries over windows of time (2 minutes) coming from thousands of sensors. • Running the CEP on hardware  ... 
doi:10.5281/zenodo.4005495 fatcat:gqqmiuqbybd67hie7ilzteghcq

Scalable join processing on very large RDF graphs

Thomas Neumann, Gerhard Weikum
2009 Proceedings of the 35th SIGMOD international conference on Management of data - SIGMOD '09  
With the proliferation of the RDF data format, engines for RDF query processing are faced with very large graphs that contain hundreds of millions of RDF triples.  ...  The current paper focuses on join processing, as the fine-grained and schema-relaxed use of RDF often entails star-and chain-shaped join queries with many input streams from index scans.  ...  The very light-weight "ubiquitous" sideways information passing in pipelined operator trees is a novel method that can greatly accelerate index scans, the basic "workhorse" in query processing with a huge  ... 
doi:10.1145/1559845.1559911 dblp:conf/sigmod/NeumannW09 fatcat:l663lpsfkfhnfk3acyfiaspimi

Smart Intra-query Fault Tolerance for Massive Parallel Processing Databases

Yunhong Ji, Yunpeng Chai, Xuan Zhou, Lipeng Ren, Yajie Qin
2019 Data Science and Engineering  
The experimental results indicate that it can improve success rate of query processing effectively, especially when working with unreliable hardware.  ...  SIFT achieves fault tolerance by performing checkpointing, i.e., materializing intermediate results of selected operators.  ...  To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.  ... 
doi:10.1007/s41019-019-00114-z fatcat:kwugnnr6dzhq7olonwz7ytqmrq

NCBI BLASTP on High-Performance Reconfigurable Computing Systems

Atabak Mahram, Martin C. Herbordt
2015 ACM Transactions on Reconfigurable Technology and Systems  
Third is pipelining the filters in a single design, including maintaining load balancing as data are reduced by orders of magnitude at each stage.  ...  First is design of the filters themselves, which perform two-hit seeding, exhaustive ungapped alignment, and exhaustive gapped alignments, respectively.  ...  ACKNOWLEDGMENTS The authors would like to thank Tom VanCourt, Yongfeng Gu, Bharat Sukhwani, Josh Model, Jin Park, and Yunfei Qiu who contributed to preliminary versions of this work.  ... 
doi:10.1145/2629691 fatcat:aqtru5wepbegvk4p6gapqgvi5q

Multi-Stream Transformers [article]

Mikhail Burtsev, Anna Rumshisky
2021 arXiv   pre-print
hypotheses improves performance, with further improvement obtained by adding a skip connection between the first and the final encoder layer.  ...  We investigate the effects of allowing the encoder to preserve and explore alternative hypotheses, combined at the end of the encoding process.  ...  Performance of the 6-layer models is significantly improved by presence of the skip connection. A model with two streams consisting of two layers each (Multistream 2(2)) shows the worst results.  ... 
arXiv:2107.10342v1 fatcat:fbaqamuxrnb57f6mvo5rwrq2bq

D4.2 – WP4 Scientific Report and Prototype Description – Y2

Yosef Moatti, Stathis Plitsos, Paula Ta Shma, Guy Khazma, Javier López Moratalla, Sandra Ebro, Pavlos Kranas, Luis Tomás Bolívar, Marta Patiño, Ainhoa Azqueta, Christos Doulkeridis, Maria Kanakari (+2 others)
2021 Zenodo  
A full demonstration of the capabilities offered by the data services has been performed during the interim review of the project, in which all the components have been integrated and interacted to demonstrate  ...  naturally at the core of BigDataStack.  ...  Its performance should be improved. This is critical if we detect a change of query load and if we want the flexibility to re-index (parts of) existing dataset to improve the data skipping score.  ... 
doi:10.5281/zenodo.4442297 fatcat:z35hkxcy2jbq7go7cgwscrezm4