Customizable Parallel Execution of Scientific Stream Queries

Milena Ivanova, Tore Risch
2005 Very Large Data Bases Conference  
Window split provides operators for parallel execution of query functions by reducing the size of stream data units using application dependent functions as parameters.  ...  Using a generic template we define two partitioning strategies for scalable parallel execution of expensive stream queries: window split and window distribute.  ...  We would like to thank the team of prof. Bo Thidé at Swedish Institute of Space Physics and Uppsala University for the useful discussions and application problems and data provided.  ... 
dblp:conf/vldb/IvanovaR05 fatcat:uag6fjzbmvb2djn7bs6o2no744

High-Performance GRID Stream Database Manager for Scientific Data [chapter]

Milena Gateva Koparanova, Tore Risch
2004 Lecture Notes in Computer Science  
In this work we describe a high-performance stream-oriented distributed database manager and query processor under development that allows efficient execution of database queries to streamed data involving  ...  processing of scientific streams.  ...  For a given CQ the query compiler of the query coordinator will construct a distributed execution plan accessing other GSDM servers or source streams.  ... 
doi:10.1007/978-3-540-24689-3_11 fatcat:ihy7smv5t5hrhe22vik2xnopfm

Massive scale-out of expensive continuous queries

Erik Zeitler, Tore Risch
2011 Proceedings of the VLDB Endowment  
Scalable execution of expensive continuous queries over massive data streams requires input streams to be split into parallel substreams.  ...  The query operators are continuously executed in parallel over these sub-streams.  ...  Scalable execution of such continuous queries with expensive computations requires input streams to be split into parallel sub-streams over which the expensive query operators are continuously executed  ... 
doi:10.14778/3402707.3402752 fatcat:fcmgs3gfyvhcvngmubw3sefr6y

Managing Long Running Queries in Grid Environment [chapter]

Ruslan Fomkin, Tore Risch
2004 Lecture Notes in Computer Science  
We propose a customizable Grid-based query processor built on top of an established Grid infrastructure, NorduGrid.  ...  This will enable efficient exchange and processing of very large amounts of data combined with CPU intensive computations, as required by many scientific applications.  ...  The secure communication is implemented by Mehran Ahsant from Center for Parallel Computers, Royal Institute of Technology, Stockholm.  ... 
doi:10.1007/978-3-540-30470-8_28 fatcat:ul6r6c6igfcm5grf4vst5hhmiu

Distributed context-aware visualization

Harald Sanftmann, Nazario Cipriani, Daniel Weiskopf
2011 2011 IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOM Workshops)  
We present the integration of visualization modules into a Java-based stream processing framework for context-aware systems, with focus on efficient communication and parallelization.  ...  In our case, stream processing is used, supporting parallelism on distributed and shared memory multiprocessors.  ...  Stream ribbon calculation time (brown) refers to the execution of the stream ribbon calculation.  ... 
doi:10.1109/percomw.2011.5766878 dblp:conf/percom/SanftmannCW11 fatcat:lag4wqeqxnhpnf2ujden6mujcq

Big Data Management: What to Keep from the Past to Face Future Challenges?

G. Vargas-Solar, J. L. Zechinelli-Martini, J. A. Espinosa-Oviedo
2017 Data Science and Engineering  
According to this model, resources provision must consider the economic cost of the processes versus the use and parallel exploitation of available computing resources.  ...  In consequence, new methodologies, algorithms and tools for querying, deploying and programming data management functions have to be provided in scalable and elastic architectures that can cope with the  ...  Relational queries are ideally suited to parallel execution because they consist of uniform operations applied to uniform streams of data.  ... 
doi:10.1007/s41019-017-0043-3 fatcat:ce5uchapafa5pbxrgeryb7jfii

Parallel data analysis directly on scientific file formats

Spyros Blanas, Kesheng Wu, Surendra Byna, Bin Dong, Arie Shoshani
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
Our design leverages the increasing main memory capacities found in supercomputers through bitmap indexing and in-memory query execution.  ...  Scientific experiments and large-scale simulations produce massive amounts of data. Many of these scientific datasets are arrays, and are stored in file formats such as HDF5 and NetCDF.  ...  Acknowledgements We would like to acknowledge the insightful comments and suggestions of three anonymous reviewers that greatly improved this paper.  ... 
doi:10.1145/2588555.2612185 dblp:conf/sigmod/BlanasWBDS14 fatcat:tfpgk6x25vefdh6ayqrtkyblly

Provenance Collection Support in the Kepler Scientific Workflow System [chapter]

Ilkay Altintas, Oscar Barney, Efrat Jaeger-Frank
2006 Lecture Notes in Computer Science  
Introduction Current technology significantly accelerates the scientific problem solving process by allowing scientists to access data remotely, distribute job execution across remote parallel resources  ...  Scientific workflow systems [1,2,3], aim to improve this situation by creating interfaces to a variety of technologies and providing tools with domain-independent customizable graphical user interfaces  ...  Acknowledgements The authors would like to thank the rest of the Kepler team for their excellent collaboration, especially to Timothy McPhillips for the discussion on requirements for a provenance framework  ... 
doi:10.1007/11890850_14 fatcat:7flkwudhvvdkfl4kncwzfcvz4q

WORKEM: Representing and Emulating Distributed Scientific Workflow Execution State

Lavanya Ramakrishnan, Dennis Gannon, Beth Plale
2010 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing  
Our evaluation shows that the framework has minimal overheads and can be scaled to run hundreds of workflows in short durations of time and with a high amount of parallelism.  ...  There is a need for a customizable, isolated and manageable testing container for design, evaluation and deployment of distributed workflows.  ...  The workflows are from diverse scientific domains and have different levels of parallelism and length or duration of the workflow. B.  ... 
doi:10.1109/ccgrid.2010.89 dblp:conf/ccgrid/RamakrishnanGP10 fatcat:kt4lvf5fwzerfc2rb3dx5pfkea

An Approach for Processing Large and Non-uniform Media Objects on MapReduce-Based Clusters [chapter]

Rainer Schmidt, Matthias Rella
2011 Lecture Notes in Computer Science  
The application is capable of analyzing and modifying large audiovisual files using multiple computer nodes in parallel and thereby able to dramatically reduce processing times.  ...  Moreover, we summarize key concepts of the implementation and provide a brief evaluation.  ...  It has also been demonstrated that these programming models are applicable to types of scientific applications, like loosely-coupled and massively parallel problems, e.g. found in bioinformatics [4].  ... 
doi:10.1007/978-3-642-24826-9_23 fatcat:33fse3sqxjhezgcnkjv7bvyleq

The Case for Multi-Engine Data Analytics [chapter]

Dimitrios Tsoumakos, Christos Mantas
2014 Lecture Notes in Computer Science  
After summarizing some of the current approaches in data analytics, we outline the structure of our envisioned Multi-Engine Management System and present some of the corresponding research directions in  ...  In this paper we argue on the need of a multi-engine environment that will exploit the largely different models, cost and quality of the existing analytics engines.  ...  low query execution time of parallel DBMSs.  ... 
doi:10.1007/978-3-642-54420-0_40 fatcat:4yd56fw575eptds4kohxkeqzqe

Managing scientific data

Anastasia Ailamaki
2011 Proceedings of the 2011 international conference on Management of data - SIGMOD '11  
Managing the enormous amount of scientific data being collected is the key to scientific progress. Though technology allows for the extreme collection rates of scientific data, processing is still performed  ...  Observation and simulation of phenomena are keys for proving scientific theories and discovering facts of Floating-point heavy; and Low update rates, with most updates append-only. key insights  ...  limited uniform query optimization and limited query-execution efficiency.  ... 
doi:10.1145/1989323.1989433 dblp:conf/sigmod/Ailamaki11 fatcat:3cxviarugnfoldnqz3csgjczqu

Managing scientific data

Anastasia Ailamaki, Verena Kantere, Debabrata Dash
2010 Communications of the ACM  
Managing the enormous amount of scientific data being collected is the key to scientific progress. Though technology allows for the extreme collection rates of scientific data, processing is still performed  ...  Observation and simulation of phenomena are keys for proving scientific theories and discovering facts of Floating-point heavy; and Low update rates, with most updates append-only. key insights  ...  limited uniform query optimization and limited query-execution efficiency.  ... 
doi:10.1145/1743546.1743568 fatcat:vw57d23aorchtntng6jlrccs6y

High performance spatial queries for spatial big data

Fusheng Wang, Ablimit Aji, Hoang Vo
2015 SIGSPATIAL Special  
Hadoop-GIS supports multiple types of spatial queries on MapReduce through skew-aware spatial partitioning, on-demand indexing, customizable spatial query engine RESQUE, implicit parallel spatial query  ...  execution on MapReduce, and effective methods for amending query results through handling boundary objects.  ...  MapReduce Based Parallel Query Execution Instead of using explicit spatial query parallelization as summarized in [7] , we take an implicit parallelization approach by leveraging MapReduce.  ... 
doi:10.1145/2766196.2766199 fatcat:esmpfrnt6bc35pqcmh2g66jfdm

SPADE

Bugra Gedik, Henrique Andrade, Kun-Lung Wu, Philip S. Yu, Myungcheol Doo
2008 Proceedings of the 2008 ACM SIGMOD international conference on Management of data - SIGMOD '08  
To that end, Spade employs a code generation framework to create highly-optimized applications that run natively on the Stream Processing Core (SPC), the execution and communication substrate of System  ...  Spade's optimizing compiler automatically maps applications into appropriately sized execution units in order to minimize communication overhead, while at the same time exploiting available parallelism  ...  We thank Olivier Verscheure and Deepak Turaga for being the early adopters of Spade and using it for the implementation of real-world System S pilots.  ... 
doi:10.1145/1376616.1376729 dblp:conf/sigmod/GedikAWYD08 fatcat:ti4umlzudfczjbodajet2b5wze