A special issue in extending data warehouses to big data analytics

Ladjel Bellatreche, Sharma Chakravarthy
2019 Distributed and parallel databases  
(structured, unstructured, and varied data types), modeling, query languages (SQL and beyond), analysis, parallel systems technology (Spark, HDFS), etc.  ...  As a consequence, the data warehousing community has to deal with data lakes (schema-free repositories), data warehouse design (data curation, data flow management and optimization), Big Data Management  ...  Query rewriting algorithm based on generated fragments and query execution strategies are given and experimentally evaluated.  ... 
doi:10.1007/s10619-019-07262-1 fatcat:j37k5v3dzzamnh6yyukn7zaxse

Volcano-an extensible and parallel query evaluation system

G. Graefe
1994 IEEE Transactions on Knowledge and Data Engineering  
The Volcano effort provides a rich environment for research and education in database systems design, heuristics for query optimization, parallel query execution, and resource allocation.  ...  To investigate the interactions of extensibility and parallelism in database query processing, we have developed a new dataflow query execution system called Volcano.  ...  GAMMA [11] is a software database machine running on a number of general-purpose CPUs as a backend to a UNIX host machine.  ... 
doi:10.1109/69.273032 fatcat:s54pmf33pjftnpqjq5kwpxzlrq

Robust heuristic algorithms for exploiting the common tasks of relational cloud database queries

Tansel Dokeroglu, Murat Ali Bayir, Ahmet Cosar
2015 Applied Soft Computing  
One can rent a large amount of resources for a short duration in order to run complex queries efficiently on large-scale data with virtual machine clusters.  ...  generation methods for relational Cloud databases, where the site locations of join tasks can be decided  ...  In a data flow query execution model, machines can work in parallel and a better data sharing can be achieved across the sites.  ... 
doi:10.1016/j.asoc.2015.01.026 fatcat:aa757oq3tfhqpanh7ac2eu6plu

Control versus data flow in parallel database machines

W.B. Teeuw, H.M. Blanken
1993 IEEE Transactions on Parallel and Distributed Systems  
The execution of a query in a parallel database machine can be controlled in either a control flow way, or in a data flow way.  ...  Index Terms: Control flow, data flow, database system performance, distributed databases, local area networks, message management, parallel query execution.  ...  INTRODUCTION The exploitation of parallelism in a database system is different from the use of parallelism in a general purpose computer system.  ... 
doi:10.1109/71.250104 fatcat:hupge3jndrfu5aydtljpqxtrru

Structured Parallel Efficient Execution Database Management System Over Enormous Dataset with MapReduce using Matlab

Uma Mahesh Kumar Gandham, P. Suresh Varma
2017 Indian Journal of Science and Technology  
Objective: MapReduce is an encoding representation and a connected execution for handing out and generate huge data set.  ...  Methodology: The present paper uses structured parallel efficient execution Database Management System i.e. Parallel Database Management Systems (PDBMS).  ...  Adaptive Efficient Query Optimization The following steps are involved in the execution of a MSQL query in a parallel database system 4 Real time Plan Selection Choose a well-organized completing  ... 
doi:10.17485/ijst/2017/v10i20/108262 fatcat:bsxkvxm45bhudejgcpcxalgaxi

Bridging Two Worlds with RICE Integrating R into the SAP In-Memory Computing Engine

Philipp Große, Wolfgang Lehner, Thomas Weichert, Franz Färber, Wen-Syan Li
2011 Proceedings of the VLDB Endowment  
We developed two novel approaches towards a solution for this basic conflict, based on the widely-used statistical software package R and the SAP In-Memory Computing Engine (IMCE).  ...  The growing need to use large amounts of data as the basis for sophisticated business analysis conflicts with the current capabilities of statistical software systems as well as the functions provided  ...  their numerous contributions to the preparation of this paper and their constant work on the software development.  ... 
dblp:journals/pvldb/GrosseLWFL11 fatcat:75wcbrcrurg2rfzivb26ebykpu

PDRS: A Performance Data Representation System [chapter]

Xian-He Sun, Xingfu Wu
2000 Lecture Notes in Computer Science  
We present the design and development of a Performance Data Representation System (PDRS) for scalable parallel computing.  ...  PDRS provides decision support that helps users find the right data to understand their programs' performance and to select appropriate ways to display and analyze it.  ...  PDRS is only a first step toward the automatic performance analysis and optimization.  ... 
doi:10.1007/3-540-45591-4_37 fatcat:47rxuuyuh5cbzhl2cltewwi65e

High level synthesis of RDF queries for graph analytics

Vito Giovanni Castellana, Marco Minutoli, Alessandro Morari, Antonino Tumeo, Marco Lattuada, Fabrizio Ferrandi
2015 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)  
The GEMS' front-end generates optimized C implementations of the input queries, modeled as graph pattern matching algorithms, which are then automatically synthesized by Bambu.  ...  Among the multitude of algorithms that may benefit from our solution, we focus on the acceleration of graph analytics applications and, in particular, on the synthesis of SPARQL queries on Resource Description  ...  Because of these characteristics, graph-based algorithms and, in general, data analytics perform poorly on these systems.  ... 
doi:10.1109/iccad.2015.7372587 dblp:conf/iccad/CastellanaMMTLF15 fatcat:bg55avgzj5ahlp6w5ciapgrksa

Integrated IoT and Cloud Environment for Fingerprint Recognition [article]

Ehsan Nadjaran Toosi and Adel Nadjaran Toosi and Reza Godaz and Rajkumar Buyya
2018 arXiv pre-print
In this paper, we propose a system for large-scale fingerprint matching application using Aneka, a platform for developing scalable applications on the Cloud.  ...  However, harnessing cloud resources for large-scale big data computation is application specific to a large extent.  ...  CUDA is a parallel computing framework which enables software developers to use GPU for general purpose processing.  ... 
arXiv:1807.08099v1 fatcat:52m2ltcd6fathkgduhsheqfydy

Parallel processing on Big Data in the context of Machine Learning and Hadoop Ecosystem: A Survey

Anilkumar Vishwanath Brahmane, R Murugan
2018 International Journal of Engineering & Technology  
In reality, a lot of institutes, businesses and in general entire society from diverse segments depend more and more on information take out from enormous quantity of raw information, statistics and numbers  ...  In this paper an evaluation is done, this studies recent technologies developed for Big Data.  ...  Distributed non-relational database. Built on top of HDFS. Query and search performance HDFS is not a general purpose file system.  ... 
doi:10.14419/ijet.v7i2.7.10885 fatcat:goyvvzlwsbeifi62nrldkgp3yy

Parallel Guided Dynamic Programming Approach for DNA Sequence Similarity Search

A. R. M. Nordin, M. S. M. Yazid, A. Aziz, M. T. A. Osman
2009 International Journal of Computer and Electrical Engineering  
This paper discusses the parallel model for FRA-Search application. The parallel FRA-Search model is implemented on PC-based cluster system.  ...  It is developed on a single program multiple data (SPMD) architecture and MPJ Express software is used as a communication interface protocol between processors.  ...  For the experiment purposes, a set of 30,000 DNA sequences are pickup randomly from five GenBank databases.  ... 
doi:10.7763/ijcee.2009.v1.61 fatcat:yxwzplth7ber3pd37pripz4p5i

Protein Sequence Comparison on the Instruction Systolic Array [chapter]

Bertil Schmidt, Heiko Schröder, Manfred Schimmler
2001 Lecture Notes in Computer Science  
To derive an efficient mapping onto this architecture, we designed a fine-grained parallel sequence comparison algorithm.  ...  This results in an implementation with significant runtime savings on Systola 1024, a parallel computer of this particular architecture.  ...  a 10Mbase search with the Smith-Waterman algorithm on different parallel machines for different query lengths.  ... 
doi:10.1007/3-540-44743-1_52 fatcat:5thwam4n3jgz7j2ne52h4mphfe

Design and Evaluation of Distributed Smart Disk Architecture for I/O-Intensive Workloads [chapter]

Steve Chiu, Wei-keng Liao, Alok Choudhary
2003 Lecture Notes in Computer Science  
We evaluate a distributed smart disk architecture with representative I/O-intensive workloads including TPC-H queries, association rule mining, data clustering, and 2-D fast Fourier transform applications  ...  Smart disks, a type of processor-embedded active I/O devices, with their on-disk memory and network interface controller, can be viewed as processing elements with attached storage.  ...  TPC-H Queries For distributed SD architectures, performing parallel database operations is one of the logical approaches to exploit data parallelism.  ... 
doi:10.1007/3-540-44864-0_24 fatcat:7ba2adq5xngrrbqppaxjdkhoay

Parallel database systems

David J. DeWitt, Jim Gray
1990 SIGMOD record  
Parallel database machine architectures based on exotic hardware have evolved to parallel database systems running atop a parallel dataflow software architecture based on conventional shared-nothing  ...  These new designs provide speedup and scaleup when processing relational database queries. This paper reviews the techniques used by such systems, and surveys current commercial and research systems.  ...  The real answer is that special-purpose database machines have indeed failed. But, parallel database systems have been a big success.  ... 
doi:10.1145/122058.122071 fatcat:4mjtunvs2bav5c27z2jjdeu4zy

Are Graph Databases Fast Enough for Static P4 Code Analysis?

Dániel Lukács, Gergely Pongrácz, Máté Tejfel
2020 International Conference on Applied Informatics  
Our current work in progress is focused on analysing the execution cost of P4 programs using hierarchical control flow graphs (CFGs).  ...  On the other hand, analysis efficiency is a requirement both for large-scale testing and end user application.  ...  All queries were executed on a single core.  ... 
dblp:conf/icai3/LukacsPT20 fatcat:3zd6cunlzzfyfitjxdhxy55foi