Mix 'n' match multi-engine analytics
2016
2016 IEEE International Conference on Big Data (Big Data)
As a remedy, we present IReS, the Intelligent Resource Scheduler for complex analytics workflows executed over multi-engine environments. ...
Its optimizer incurs only marginal overhead to the workflow execution performance, managing to discover the optimal execution plan within a few seconds, even for large-scale workflow instances. ...
The central notion behind IReS is to utilize detailed models of the costs and performance characteristics of analytics operators over multiple execution engines. ...
doi:10.1109/bigdata.2016.7840605
dblp:conf/bigdataconf/DokaPGTK16
fatcat:nimpr3poqfhapgxssfoulvva6u
IReS
2015
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data - SIGMOD '15
To this end, we demonstrate IReS, the Intelligent Resource Scheduler for complex analytics workflows executed over multi-engine environments. ...
IReS is then able to match distinct workflow parts to the execution and/or storage engine among the available ones in order to optimize with respect to a user-defined policy. ...
The central notion behind the IReS platform is to create detailed models of the costs and performance characteristics of various analytics operations over multiple execution engines. ...
doi:10.1145/2723372.2735377
dblp:conf/sigmod/DokaPTMK15
fatcat:5dyeik3d45esje2mbqmwravm4q
SheerMP: Optimized Streaming Analytics-as-a-Service over Multi-site Multi-platform Settings
2022
Zenodo
In this paper, we demonstrate a prototype system that optimizes streaming analytics workflows across Big Data platforms and computer clusters. ...
a wide variety of practical optimization and adaptive resource allocation scenarios over a variety of streaming Big Data platforms ...
SheerMP automates optimization decisions, submits and migrates streaming analytics workflows, and monitors their execution over a variety of streaming Big Data platforms. ...
doi:10.5281/zenodo.6345356
fatcat:6wwzu3ijgfdzbfzb7grz25xlkm
The Case for Multi-Engine Data Analytics
[chapter]
2014
Lecture Notes in Computer Science
Such an environment further requires an intelligent management system for orchestrating and coordinating complex analytics tasks over the different available engines. ...
In this paper we argue on the need of a multi-engine environment that will exploit the largely different models, cost and quality of the existing analytics engines. ...
Modeling and Learning Engine In order for the scheduler to choose an optimized execution plan for an analytics task that will span (a) multiple execution engines and (b) multiple data stores, a detailed ...
doi:10.1007/978-3-642-54420-0_40
fatcat:4yd56fw575eptds4kohxkeqzqe
D4.1 Definition of Architecture for Extreme-Scale Analytics
2019
Zenodo
physical resources in a way that optimizes specific performance measures, (iii) providing real-time, interactive machine learning and data mining tools that can be leveraged by the designed workflows, ...
to an omnibus solution for extreme-scale streaming analytics. ...
Once a workflow is sent to the Optimizer, the Optimizer enumerates the space of possible and promising execution plans for the workflow and estimates plan costs using a dynamic cost model that predicts ...
doi:10.5281/zenodo.4034092
fatcat:g766jj6xwvesddsm3xs56l6mqq
Stubby: A Transformation-based Optimizer for MapReduce Workflows
[article]
2012
arXiv
pre-print
However, automatic cost-based optimization of MapReduce workflows remains a challenge due to the multitude of interfaces, large size of the execution plan space, and the frequent unavailability of all ...
Studies have shown that the gap in performance can be quite large between optimized and unoptimized workflows. ...
Since many analytical workflows are run periodically, the optimization overhead of Stubby can be amortized over multiple workflow runs. ...
arXiv:1208.0082v1
fatcat:2qzmt6psizbylogpnqhm7tsyrq
The Many Faces of Data-centric Workflow Optimization: A Survey
[article]
2017
arXiv
pre-print
Firstly, to present the main dimensions of the relevant optimization problems and the types of optimizations that occur before flow execution. ...
This survey focuses on data-centric workflows (or workflows for data analytics or data flows), where a key aspect is data passing through and getting manipulated by a sequence of steps. ...
, such as the workflow monitoring and data provision components; iii) workflow execution plan (WEP) generation, where the workflow plan is optimized, e.g., through workflow refactoring and parallelization ...
arXiv:1701.07723v1
fatcat:fasmrggxfzb33ckcookphwdve4
Odyssey
2013
Proceedings of the VLDB Endowment
Acknowledgment: We thank NEC's product and business teams for their generous support and contributions. ...
The future phases of the system development plan to include additional execution engines, such as a columnar in-memory store. ...
Currently, the system uses two execution engines, namely Hadoop and the Relational DW. ...
doi:10.14778/2536222.2536249
fatcat:xtkgwtevx5cg5dmnyha3ate3di
Design and Development of an Adaptive Workflow-Enabled Spatial-Temporal Analytics Framework
2012
2012 IEEE 18th International Conference on Parallel and Distributed Systems
In this paper, we present the architecture of such a WfMS and evaluate it in terms of performance for execution of workflows in Clouds. ...
Cloud computing is a suitable platform for execution of complex computational tasks and scientific simulations that are described in the form of workflows. ...
Our proposed architecture is able to (i) share workflows from multiple users for analytics, (ii) harness a workflow management and scheduling engine for adaptive resource allocation and optimization, ( ...
doi:10.1109/icpads.2012.141
dblp:conf/icpads/LiCLWPZB12
fatcat:b4hw676b6jalfo2fpet3lgnueu
Optimizing Resource Allocation for Scientific Workflows Using Advance Reservations
[chapter]
2010
Lecture Notes in Computer Science
The recent interest in web services and service-oriented architectures has strongly facilitated the development of individual workflow activities as well as their composition and the distributed execution ...
However, in many applications concurrent scientific workflows may be served by multiple competing providers, with each of them offering only limited resources. ...
Finally, it is worth mentioning that the workflow is scheduled to be run by two different workflow engines: Activities 1 through 8 are executed by WF-A, then control (and data) is handed over to WF-B for ...
doi:10.1007/978-3-642-13818-8_30
fatcat:gr3ivljlsfhktgyjejdv77ogji
Bandwidth Optimization In Data Retrieval From Cloud Using Continuous Hive Language
2016
International Journal Of Engineering And Computer Science
The proposed system optimizes query execution plans and data replication to minimize bandwidth cost. ...
Systems that compute SQL analytics over geographically distributed data operate by pulling all data to a central location. ...
Query Deployment engine is responsible for deploying generated Optimized Query Plan (OQP) onto the processing nodes in the network topology. ...
doi:10.18535/ijecs/v5i6.20
fatcat:cpptv3anancorjumcjvxuqyoba
D5.1 Operator Cost Estimation and Workflow Optimisation Technology V1
2020
Zenodo
This deliverable presents techniques for optimizing workflow execution in terms of a set of optimization objectives (e.g., throughput, resource utilization) of extreme-scale analytics across different, ...
It ingests statistics collected by the Manager Component to perform cost estimations and judge the performance of alternative execution plans i.e., the Optimizer Component transforms the logical workflow ...
The goal of the optimizer is to receive the initial workflow drawn by the user, i.e., a logical workflow, and optimize its execution over multiple, networked clusters, Big Data platforms and admissible ...
doi:10.5281/zenodo.4034108
fatcat:t22h4qqgjfbsporpl4zkf5c2qm
Musketeer
2015
Proceedings of the Tenth European Conference on Computer Systems - EuroSys '15
This is a direct consequence of the tight coupling between user-facing front-ends that express workflows (e.g., Hive, SparkSQL, Lindi, GraphLINQ) and the back-end execution engines that run them (e.g., ...
Musketeer speeds up realistic workflows by up to 9× by targeting different execution engines, without requiring any manual effort. ...
However, some execution engines have limited expressivity and therefore require the data-flow DAG to be partitioned into multiple jobs. ...
doi:10.1145/2741948.2741968
dblp:conf/eurosys/GogSCGCH15
fatcat:j67jem3ohzef7pwh2ymldjwehy
A multiple-objective workflow scheduling framework for cloud data analytics
2012
2012 Ninth International Conference on Computer Science and Software Engineering (JCSSE)
Our designed framework uses a meta-heuristics method, called Artificial Bee Colony (ABC), to create an optimized scheduling plan. The framework allows multiple constraints and objectives to be set. ...
In this paper, we proposed a workflow scheduling framework that can efficiently schedule series workflows with multiple objectives onto a cloud system. ...
The framework allows multiple objectives and constraints to be set in order to optimize the performance of data analytics workflow scheduling in cloud environments. ...
doi:10.1109/jcsse.2012.6261985
fatcat:2cb66vgxsvdyfnxliifonmlyya
Large-scale social-media analytics on stratosphere
2013
Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion
that eases the formulation of complete analytical workflows. ...
Consequently, a wide range of analytics has been proposed to understand, steer, and exploit the mechanics and laws driving their functionality and creating the resulting benefits. ...
Acknowledgements We thank Christoph Nagel and Stephan Pieper (now with http://www.surpreso.com) for their implementation support while at TU Berlin and the Stratosphere team. ...
doi:10.1145/2487788.2487916
dblp:conf/www/BodenKFM13
fatcat:oom64pvgtrbobfb4hygyvi2i4u