Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting

Francisco J. Clemente-Castello, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernandez
2018 IEEE Transactions on Parallel and Distributed Systems  
To this end, the current paper contributes with a performance model and methodology to estimate the runtime of iterative MapReduce applications in a hybrid cloud bursting scenario.  ...  data analytics, especially for iterative applications.  ...  Performance Model In this section, we introduce a performance model that enables users to estimate the runtime of iterative MapReduce applications in hybrid cloud bursting scenarios.  ... 
doi:10.1109/tpds.2018.2802932 fatcat:fjssmetdcjb3bem6jkqkq4owuy

Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce

Francisco J. Clemente-Castello, Bogdan Nicolae, M. Mustafa Rafique, Rafael Mayo, Juan Carlos Fernandez
2017 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)  
In this paper we study how to combine various MapReduce data locality techniques designed for hybrid cloud bursting in order to achieve scalability for iterative MapReduce applications in a cost-effective  ...  We show that using the right combination of techniques, iterative MapReduce applications can scale well in a hybrid cloud bursting scenario and come even close to the scalability observed in single sites  ...  This paper has studied the feasibility of hybrid cloud bursting for iterative MapReduce applications.  ... 
doi:10.1109/ccgrid.2017.96 dblp:conf/ccgrid/Clemente-Castello17 fatcat:julzxysp75hwvmpizgs5ezjaqe

On exploiting data locality for iterative mapreduce applications in hybrid clouds

Francisco J. Clemente-Castelló, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernández, M. Mustafa Rafique
2016 Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies - BDCAT '16  
Specifically, we focus our study on iterative MapReduce applications, which are a class of large-scale data intensive applications particularly popular on hybrid clouds.  ...  Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost the capacity during peak utilization), has made significant impact especially for big data analytics, where the explosion  ...  This paper has contributed with an analysis of the challenges that arise when running iterative MapReduce applications in hybrid cloud bursting.  ... 
doi:10.1145/3006299.3006329 dblp:conf/bdc/Clemente-Castello16 fatcat:3xlkkjny6zajtk54p5j2rqfsbu

Enabling Big Data Analytics in the Hybrid Cloud Using Iterative MapReduce

Francisco J. Clemente-Castello, Bogdan Nicolae, Kostas Katrinis, M. Mustafa Rafique, Rafael Mayo, Juan Carlos Fernandez, Daniela Loreti
2015 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC)  
This paper addresses this gap by taking on the challenge of bursting over hybrid clouds for the benefit of accelerating iterative MapReduce applications.  ...  In addition, we contribute with a performance prediction methodology that combines modeling with micro-benchmarks to estimate completion time for iterative MapReduce applications, which enables users to  ...  Methodology to leverage the hybrid performance model In order to make use of the hybrid performance model introduced above for actual predictions, we need to estimate the hybrid rebalance factors.  ... 
doi:10.1109/ucc.2015.47 dblp:conf/ucc/Clemente-Castello15 fatcat:wmo6c3utunbmbdajulzo5cbtom

A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks [article]

Sanaa Hamid Mohamed, Taisir E.H. El-Gorashi, Jaafar M.H. Elmirghani
2019 arXiv   pre-print
The MapReduce programming model and its widely-used open-source platform; Hadoop, are enabling the development of a large number of cloud-based services and big data applications.  ...  In this survey, we present a summary of the characteristics of various big data programming models and applications and provide a review of cloud computing infrastructures, and related technologies such  ...  All data are provided in full in the results section of this paper.  ... 
arXiv:1910.00731v1 fatcat:kvi3br4iwzg3bi7fifpgyly7m4

Floe: A Continuous Dataflow Framework for Dynamic Cloud Applications [article]

Yogesh Simmhan, Alok Kumbhare
2014 arXiv   pre-print
Floe is a continuous dataflow framework that is designed to be adaptive for dynamic applications on Cloud infrastructure.  ...  It offers advanced dataflow patterns like BSP and MapReduce for flexible and holistic composition of streams and files, and supports dynamic recomposition at runtime with minimal impact on the execution  ...  The hybrid model performs better than the static model and finishes processing the messages within the given tolerance while using less resources than the dynamic model.  ... 
arXiv:1406.5977v1 fatcat:tfxojvrp7jesjlbau6xeq62bfa

Data-Intensive Cloud Computing: Requirements, Expectations, Challenges, and Solutions

Jawwad Shamsi, Muhammad Ali Khojaye, Mohammad Ali Qasmi
2013 Journal of Grid Computing  
It provides a detailed study of the solutions and analyzes their capabilities in meeting emerging needs of widespread applications.  ...  A data-intensive cloud provides an abstraction of high availability, usability, and efficiency to users.  ...  The combination of column oriented and row-oriented approaches leads Cassandra to a hybrid model for data storage and management.  ... 
doi:10.1007/s10723-013-9255-6 fatcat:l27ga4kh7nhnjd6nb6n57autgq

A survey of cloud-based network intrusion detection analysis

Nathan Keegan, Soo-Yeon Ji, Aastha Chaudhary, Claude Concolato, Byunggu Yu, Dong Hyun Jeong
2016 Human-Centric Computing and Information Sciences  
We offer a current overview of this growing body of research, highlighting successes, challenges, and future directions for MLA-usage in cloud-based network intrusion detection approaches.  ...  Since the dawn of computer networking, intrusion detection systems (IDSes) have played a critical role in ensuring safe networks for all users, but the shape of the role has changed throughout recent history  ...  A possible implementation of MapReduce for a cloud-based intrusion detection technique is inherently simple.  ... 
doi:10.1186/s13673-016-0076-z fatcat:anffwy2svvcbbafj7qrfb3z2ay

Building and scaling virtual clusters with residual resources from interactive clouds

R. Benjamin Clay, Zhiming Shen, Xiaosong Ma
2013 Proceedings of the 22nd international symposium on High-performance parallel and distributed computing - HPDC '13  
RHIC builds ad-hoc clusters for running throughput-oriented "background" workloads using a hybrid of residual and dedicated resources.  ...  RHIC employs black-box workload performance modeling, requiring only system-level metrics and incorporating techniques to improve modeling accuracy under bursty and heterogeneous residual resources.  ...  Hybrid MapReduce, Volunteerism and Cluster Sharing. Prior works use Amazon EC2 Spot Instances to perform MapReduce jobs [9, 21, 27], whose transience is similar to interactive cloud nodes.  ... 
doi:10.1145/2493123.2462927 fatcat:h6kgbrs6c5cwhnjjfkxys4pcxu


2020 International journal of engineering sciences & research technology  
) model for forecasting of rainfall or cloud burst based on the previous record of the bursting in different state.  ...  So, in this research we focus to develop a model using the optimized Artificial Neural Network (ANN) for prediction of cloud bursting in India and developed model in known as Cloud Burst Foresting (CBF  ...  Authors address the problem of how to estimate the runtime of iterative MapReduce applications in hybrid cloud bursting scenarios where on premise and off-premise monitoring that host a MapReduce environment  ... 
doi:10.29121/ijesrt.v9.i10.2020.14 fatcat:ckvahm5pmvhbznwfm54upd62l4

Data Processing Model to Perform Big Data Analytics in Hybrid Infrastructures

Julio C. S. Anjos, Kassiano J. Matteussi, Paulo R. R. De Souza, Gabriel J. A. Grabher, Guilherme A. Borges, Jorge L. V. Barbosa, Gabriel V. Gonzalez, Valderi R. Q. Leithardt, Claudio F. R. Geyer
2020 IEEE Access  
HyMR [30] is a framework for enabling an autonomic Cloud burst for clusters of virtual machines that executes MR jobs over a Multi-Cloud.  ...  The data locality and data movement remain a challenge for accelerating iterative MR in HC once iterative applications reuse invariant input data. Clement et al.  ... 
doi:10.1109/access.2020.3023344 fatcat:dmqifexpivhnld75k3uwbmyaui

Accelerating Batch Analytics with Residual Resources from Interactive Clouds

R. Benjamin Clay, Zhiming Shen, Xiaosong Ma
2013 2013 IEEE 21st International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems  
RHIC builds ad-hoc clusters for running throughput-oriented "background" workloads using a hybrid of residual and dedicated resources.  ...  RHIC employs black-box workload performance modeling, requiring only system-level metrics and incorporating techniques to improve modeling accuracy with bursty and heterogeneous residual resources.  ...  Any opinions expressed in this paper are those of the authors and do not necessarily reflect the views of the NSF or U.S. Government.  ... 
doi:10.1109/mascots.2013.63 dblp:conf/mascots/ClaySM13 fatcat:ksz5rdsbnna5rjlkc2ubd2n6fi

Chiminey: Reliable Computing and Data Management Platform in the Cloud [article]

Iman I. Yusuf and Ian E. Thomas and Maria Spichkova and Steve Androulakis and Grischa R. Meyer and Daniel W. Drumm and George Opletal and Salvy P. Russo and Ashley M. Buckle and Heinz W. Schmidt
2015 arXiv   pre-print
We present here Chiminey, a software platform that enables researchers to (i) run applications on both traditional high-performance computing and cloud-based computing infrastructures, (ii) handle failure  ...  The enabling of scientific experiments that are embarrassingly parallel, long running and data-intensive into a cloud-based execution environment is a desirable, though complex undertaking for many researchers  ...  The MapReduce computation is performed iteratively until the predefined criterion is met.  ... 
arXiv:1507.01321v1 fatcat:hoirxinvqnfqjm3ppy3b2rdwzy

Big Data computing and clouds: Trends and future directions

Marcos D. Assunção, Rodrigo N. Calheiros, Silvia Bianchi, Marco A.S. Netto, Rajkumar Buyya
2015 Journal of Parallel and Distributed Computing  
This paper discusses approaches and environments for carrying out analytics on Clouds for Big Data applications.  ...  It revolves around four important areas of analytics and Big Data, namely (i) data management and supporting architectures; (ii) model development and scoring; (iii) visualisation and user interaction;  ...  A hybrid Cloud is used to speed up the application execution.  ... 
doi:10.1016/j.jpdc.2014.08.003 fatcat:l4d5t2y4hrhg5irbyk7bd6zfo4

Time and Cost Sensitive Data-Intensive Computing on Hybrid Clouds

Tekin Bicer, David Chiu, Gagan Agrawal
2012 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)  
In this paper, we describe a modeling-driven resource allocation framework to support both time and cost sensitive execution for data-intensive applications executed in a hybrid cloud setting.  ...  Within local clusters, competition for resources complicates applications with deadlines.  ...  We developed a model for the class of Map-Reducible applications which captures the performance efficiencies and the projected costs for the allocated cloud resources.  ... 
doi:10.1109/ccgrid.2012.95 dblp:conf/ccgrid/BicerCA12 fatcat:mmq5kxt4uzef5dqlmexnascv3i