On exploiting data locality for iterative mapreduce applications in hybrid clouds

Francisco J. Clemente-Castelló, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernández, M. Mustafa Rafique
2016 Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies - BDCAT '16  
Specifically, we focus our study on iterative MapReduce applications, which are a class of large-scale data intensive applications particularly popular on hybrid clouds.  ...  Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost the capacity during peak utilization), has made significant impact especially for big data analytics, where the explosion  ...  This work focuses on iterative MapReduce applications running on hybrid clouds, where the data is initially on-premise only.  ...
doi:10.1145/3006299.3006329 dblp:conf/bdc/Clemente-Castello16 fatcat:3xlkkjny6zajtk54p5j2rqfsbu

Enabling Big Data Analytics in the Hybrid Cloud Using Iterative MapReduce

Francisco J. Clemente-Castello, Bogdan Nicolae, Kostas Katrinis, M. Mustafa Rafique, Rafael Mayo, Juan Carlos Fernandez, Daniela Loreti
2015 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC)  
This paper addresses this gap by taking on the challenge of bursting over hybrid clouds for the benefit of accelerating iterative MapReduce applications.  ...  We show through experimentation in a dual-Openstack hybrid cloud setup that our solutions manage to bring substantial improvement at predictable cost-control for two real-life iterative MapReduce applications  ...  We focus on one specific class of big data analytics applications that is particularly suitable for "hybrid cloud big data analytics": iterative applications that reuse invariant input data.  ... 
doi:10.1109/ucc.2015.47 dblp:conf/ucc/Clemente-Castello15 fatcat:wmo6c3utunbmbdajulzo5cbtom

Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce

Francisco J. Clemente-Castello, Bogdan Nicolae, M. Mustafa Rafique, Rafael Mayo, Juan Carlos Fernandez
2017 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)  
In this paper we study how to combine various MapReduce data locality techniques designed for hybrid cloud bursting in order to achieve scalability for iterative MapReduce applications in a cost-effective  ...  We show that using the right combination of techniques, iterative MapReduce applications can scale well in a hybrid cloud bursting scenario and come even close to the scalability observed in single sites  ...  Scheduling strategies for Hadoop MapReduce applications running on hybrid clouds have been proposed in numerous studies.  ... 
doi:10.1109/ccgrid.2017.96 dblp:conf/ccgrid/Clemente-Castello17 fatcat:julzxysp75hwvmpizgs5ezjaqe

Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting

Francisco J. Clemente-Castello, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernandez
2018 IEEE Transactions on Parallel and Distributed Systems  
While there are several data locality techniques dedicated for big data bursting on hybrid clouds, their effectiveness is difficult to estimate in advance.  ...  data analytics, especially for iterative applications.  ...  TECHNIQUES TO LEVERAGE DATA LOCALITY FOR ITERATIVE MAPREDUCE In this section we present two complementary techniques to improve data locality for hybrid cloud bursting.  ... 
doi:10.1109/tpds.2018.2802932 fatcat:fjssmetdcjb3bem6jkqkq4owuy

Hybrid cloud and cluster computing paradigms for life science applications

Judy Qiu, Jaliya Ekanayake, Thilina Gunarathne, Jong Choi, Seung-Hee Bae, Hui Li, Bingjing Zhang, Tak-Lon Wu, Yang Ruan, Saliya Ekanayake, Adam Hughes, Geoffrey Fox
2010 BMC Bioinformatics  
However they have limited applicability to some areas such as data mining because MapReduce has poor performance on problems with an iterative structure present in the linear algebra that underlies much  ...  Clouds and MapReduce have shown themselves to be a broadly useful approach to scientific computing especially for parallel data intensive applications.  ...  Acknowledgements We appreciate Microsoft for their technical support.  ... 
doi:10.1186/1471-2105-11-s12-s3 pmid:21210982 pmcid:PMC3040529 fatcat:vjbihezanrdf5dd7ezved45pyq

A Combined Analytical Modeling Machine Learning Approach for Performance Prediction of MapReduce Jobs in Cloud Environment

Ehsan Ataie, Eugenio Gianniti, Danilo Ardagna, Ali Movaghar
2016 2016 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)  
In this work, we propose and validate a hybrid approach exploiting both queuing networks and support vector regression, in order to achieve a good accuracy without too many costly experiments on a real  ...  Nowadays MapReduce and its open source implementation, Apache Hadoop, are the most widespread solutions for handling massive dataset on clusters of commodity hardware.  ...  of MapReduce jobs in a cloud cluster.  ... 
doi:10.1109/synasc.2016.072 dblp:conf/synasc/AtaieGAM16 fatcat:udouh7sgmjadzj57jf6p7j2xii

Data mining algorithms as a service in the cloud exploiting relational database systems

Carlos Ordonez, Javier García-García, Carlos Garcia-Alvarado, Wellington Cabrera, Veerabhadran Baladandayuthapani, Mohammed S. Quraishi
2013 Proceedings of the 2013 international conference on Management of data - SIGMOD '13  
Unlike other analytic systems, our solution is not based on MapReduce.  ...  complex methods as a task for the cloud DBMS.  ...  We thank Rogelio Montero-Campos, Julio Vega and José Luis Alvarez, from UNAM University, for helping in the development of our system.  ... 
doi:10.1145/2463676.2465240 dblp:conf/sigmod/OrdonezGGCBQ13 fatcat:nahqsy6qdbgazj3ee5et427bgm

BIGhybrid -- A Toolkit for Simulating MapReduce in Hybrid Infrastructures

Julio C.S. dos Anjos, Gilles Fedak, Claudio F.R. Geyer
2014 2014 International Symposium on Computer Architecture and High Performance Computing Workshop  
Cloud computing has increasingly been used as a platform for running large business and data processing applications.  ...  Merging cloud computing and desktop grids into a hybrid infrastructure can provide a feasible low-cost solution for big data analysis.  ...  The experiments discussed in this paper were conducted with the aid of the Grid'5000 experimental testbed, under the INRIA ALADDIN development plan with support from CNRS, RENATER and a number of universities  ... 
doi:10.1109/sbac-padw.2014.8 dblp:conf/sbac-pad/AnjosFG14 fatcat:2iz4zeicovatzfnz6fc3y64rpa

Scalable parallel computing on clouds using Twister4Azure iterative MapReduce

Thilina Gunarathne, Bingjing Zhang, Tak-Lon Wu, Judy Qiu
2013 Future generations computer systems  
Recent advances in data intensive computing for science discovery are fueling a dramatic growth in the use of data-intensive iterative computations.  ...  We also study and present solutions to several factors that affect the performance of iterative MapReduce applications on Windows Azure Cloud.  ...  These applications running on the Azure Cloud exhibited performance comparable to the Apache Hadoop on a dedicated local cluster.  ...
doi:10.1016/j.future.2012.05.027 fatcat:7555zbco7rggvhvpvswuvsnteu

State Space Exploration of RT Systems in the Cloud [article]

Carlo Bellettini, Matteo Camilli, Lorenzo Capra, Mattia Monga
2012 arXiv   pre-print
The growing availability of distributed and cloud computing frameworks make it possible to face complex computational problems in a more effective and convenient way.  ...  We present and compare two different approaches to state-space explosion, relying on distributed and cloud frameworks, respectively.  ...  When the expansion front exceeds T, an Iterative MapReduce model on a large cluster of machines is employed. We call this approach (sketched in Fig. 6) Hybrid Iterative MapReduce (himapred).  ...
arXiv:1203.6806v1 fatcat:z2noflaufjd5dnadpqyni5dwpy

Big Data Processing in Cloud Computing Environments

Changqing Ji, Yu Li, Wenming Qiu, Uchechukwu Awada, Keqiu Li
2012 2012 12th International Symposium on Pervasive Systems, Algorithms and Networks  
Finally, we discuss the open issues and challenges, and deeply explore the research directions in the future on big data processing in cloud computing environments.  ...  Following the MapReduce parallel processing framework, we then introduce MapReduce optimization strategies and applications reported in the literature.  ...  Recently, many research efforts have exploited the MapReduce framework for solving challenging data processing problems on large scale datasets in different domains.  ... 
doi:10.1109/i-span.2012.9 fatcat:5nk7w7xdlzbe7dlxb2mqq7fne4

Review on the Cloud Computing Programming Model

Chao Shen, Weiqin Tong
2014 International Journal of Advanced Science and Technology  
Cloud computing integrates vast computing and/or storage resources together, which provides services on demand via network.  ...  The cloud computing data center is usually composed of thousand of commercial computers, and these computers are connected by network.  ...  Pregel is similar in concept to MapReduce but much more efficient support for iterative computations over the graph. Hama [16] provides BSP library based on the Hadoop framework.  ... 
doi:10.14257/ijast.2014.70.02 fatcat:etcuk75oczgabewmzo62sokii4

A Comparative Study of Association Rule Mining Algorithms on Grid and Cloud Platform [article]

Sudhakar Singh, Rakhi Garg, P. K. Mishra
2017 arXiv   pre-print
Grid and cloud are the emerging platform for distributed data processing and various association rule mining algorithms have been proposed on such platforms.  ...  We differentiate between approaches of association rule mining algorithms developed on these architectures on the basis of data locality, programming paradigm, fault tolerance, communication cost, partition  ...  MapReduce is a simplified programming model used in cloud computing for processing large volume of data sets [26], [40].  ...
arXiv:1709.07594v1 fatcat:6jneovykqvfq5ja6vmds3akziy

Data-Intensive Cloud Computing: Requirements, Expectations, Challenges, and Solutions

Jawwad Shamsi, Muhammad Ali Khojaye, Mohammad Ali Qasmi
2013 Journal of Grid Computing  
This paper analyzes the extensive requirements which exist in data-intensive clouds, describes various challenges related to the paradigm, and assess numerous solutions in meeting these requirements and  ...  It provides a detailed study of the solutions and analyzes their capabilities in meeting emerging needs of widespread applications.  ...  In the data-intensive world we live, requirements and challenges also vary with applications, For example, an iterative application such as page-rank computation algorithm requires iterative computation  ... 
doi:10.1007/s10723-013-9255-6 fatcat:l27ga4kh7nhnjd6nb6n57autgq

Enabling Cloud Interoperability with COMPSs [chapter]

Fabrizio Marozzo, Francesc Lordan, Roger Rafanell, Daniele Lezzi, Domenico Talia, Rosa M. Badia
2012 Lecture Notes in Computer Science  
The framework has been evaluated through the porting of a data mining workflow to COMPSs and the execution on an hybrid testbed.  ...  On the other side, such ability, has opened new challenges for the execution of their computational work and the managing of massive amounts of data into resources provided by different private and public  ...  This work was also made possible using the computing use grant provided by Microsoft in the VENUS-C project.  ... 
doi:10.1007/978-3-642-32820-6_4 fatcat:3mdsmk3kpzh5rjleqc2tthltri