
vHadoop: A Scalable Hadoop Virtual Cluster Platform for MapReduce-Based Parallel Machine Learning with Performance Consideration

Kejiang Ye, Xiaohong Jiang, Yanzhang He, Xiang Li, Haiming Yan, Peng Huang
2012 2012 IEEE International Conference on Cluster Computing Workshops  
This paper focuses on the performance of the Hadoop virtual cluster and proposes vHadoop, a scalable Hadoop virtual cluster platform for large-scale MapReduce-based parallel data processing.  ...  migration of hadoop virtual cluster.  ...  In the static performance analysis, we mainly study the performance of the cross-domain Hadoop virtual cluster and the scalability of the Hadoop virtual cluster.  ... 
doi:10.1109/clusterw.2012.32 dblp:conf/cluster/YeJHLYH12 fatcat:fipeybcjp5hm3coeei5zpaxday

An Auto-Scaling Framework for Analyzing Big Data in the Cloud Environment

Rachana Jannapureddy, Quoc-Tuan Vien, Purav Shah, Ramona Trestian
2019 Applied Sciences  
Such a dynamic scaling method offers a reference to improving the Twitter data analysis in a more cost-effective and flexible way.  ...  In consideration of this, this paper investigates an auto-scaling framework on cloud environment aiming to minimise the cost of resource use by automatically adjusting the virtual nodes depending on the  ...  (R3) An efficient sentiment analysis should be performed via the auto-scaling cluster with low cost and short scaling time.  ... 
doi:10.3390/app9071417 fatcat:i6kovbm6zfaihlot6giwfrtzae

Survey on improved Autoscaling in Hadoop into cloud environments

Masoumeh Rezaei Jam, Leyli Mohammad Khanli, Mohammad Kazem Akbari, Elham Hormozi, Morteza Sargolzaei Javan
2013 The 5th Conference on Information and Knowledge Technology  
In this survey, we investigate methods to improve the scalability of the Hadoop platform and its autoscaling.  ...  Because of that models and methods to design and analyze parallel processing of data is done automatically.  ...  Further to this, we gratefully acknowledge those in the cloud computing team at the Department of Computer Engineering and Information Technology, Amirkabir University, Iran and Cloud Computing lab in  ... 
doi:10.1109/ikt.2013.6620031 fatcat:3ra3zakwgrd2zaeelc3h3jtrqe


K. M. Kiran Raj, Krishna Prasad K
2022 Zenodo  
Virtual components such as VMware, vSphere, and Cloudera can be used to virtualize Hadoop, which also helps optimize both hardware and software tools.  ...  We will also discuss some of the tools used or provided for analysis of Big Data for customers. The next part focuses on financial analysis and PEST analysis of the company.  ...  Using virtualization and a private cloud is cost-effective and improves performance, with minimal downtime, fault tolerance, flexibility, and scalability.  ... 
doi:10.5281/zenodo.6568478 fatcat:tj6jqckhhzenjkc2eqabo4ecty

Towards a cost-efficient MapReduce: Mitigating power peaks for Hadoop clusters

Nan Zhu, Xue Liu, Jie Liu, Yu Hua
2014 Tsinghua Science and Technology  
Deploying and operating such systems incurs large costs, including hardware costs to build clusters and energy costs to run them.  ...  In this paper, we take Hadoop as an example to illustrate the power peak problem, which causes power inefficiency, and provide in-depth analysis to explain issues with existing system designs.  ...  Acknowledgements Xue Liu would like to thank the support of the National Science Foundation of USA (No. 1116606).  ... 
doi:10.1109/tst.2014.6733205 fatcat:fboga6hydba5jkdtfsfs5hn4iy

Performance Evaluation of Hadoop in Cloud for Big Data

Mohammed Fakherldin, Ibrahim Aaker Targio Hashem, Abdullah Alzuabi, Faiz Alotaibi
2018 International Journal of Engineering & Technology  
This analysis has an impact on the processing in terms of execution time and cost of using each one of them.  ...  This paper provides a review and analysis of the impact of using a physical versus a cloud cluster in processing a large amount of data.  ...  Overview of Hadoop Cluster: Hadoop has offered a new alternative way to efficiently mine petabytes of unstructured information across multiple machines with lower-cost commodity hardware.  ... 
doi:10.14419/ijet.v7i4.15.21363 fatcat:bulmfekcfrhuxjbkl4rvcaf5je

Performance of a Low Cost Hadoop Cluster for Image Analysis in Cloud Robotics Environment

Basit Qureshi, Yasir Javed, Anis Koubâa, Mohamed-Foued Sriti, Maram Alajlan
2016 Procedia Computer Science  
Furthermore, the performance of RPi-based clusters is extensively tested with different types of data including text, text/image and image, and a comparative analysis against Hadoop cluster running on  ...  Results show that although the RPi Hadoop cluster lags in performance compared to a Hadoop cluster running on virtual machines, its low cost and small form factor make it ideal for remote image analysis in  ...  supported by the DroneMap project entitled DroneMap: A Cloud Robotics System for Unmanned Aerial Vehicles in Surveillance Applications under the grant number 35-157 from King AbdulAziz City for Science and  ... 
doi:10.1016/j.procs.2016.04.013 fatcat:ts7yklaswjciddqbwoz77enx2a

A Cost Effective Virtual Cluster with Hadoop Framework for Big Data Analytics

Seraj Al Mahmud Mostafa, A. B. M Moniruzzaman
2015 International Journal of Database Theory and Application  
This paper proposes a low-cost, scalable Hadoop virtual cluster platform and focuses on the performance of the Hadoop virtual cluster.  ...  The contribution of this paper is to design, model, and implement a cost-effective elastic virtual data center with the Hadoop framework and resource utilization for educational institutions to provide high  ... 
doi:10.14257/ijdta.2015.8.6.18 fatcat:fcln3gay7fby5l2tc2mko72oua

Understanding Vertical Scalability of I/O Virtualization for MapReduce Workloads: Challenges and Opportunities [chapter]

Bogdan Nicolae
2014 Lecture Notes in Computer Science  
best practices for current approaches and speculate on future areas of improvement.  ...  One such important challenge relates to the limited scalability of I/O, a determining factor in the overall performance of big data applications.  ...  In a quest to keep up with scalability, paradigms such as MapReduce were specifically designed to decouple tasks and improve horizontal scalability of big data systems.  ... 
doi:10.1007/978-3-642-54420-0_1 fatcat:t7isdcjlsvbnlkdn7qio6ta5ma

Improving MapReduce Performance Using Smart Speculative Execution Strategy

Qi Chen, Cheng Liu, Zhen Xiao
2014 IEEE Transactions on Computers  
Experiment results show that MCP can run jobs up to 39 percent faster and improve the cluster throughput by up to 44 percent compared to Hadoop-0.21.  ...  We evaluate MCP in a cluster of 101 virtual machines running a variety of applications on 30 physical servers.  ...  This work was supported by the National Natural Science Foundation of China (Grant No. 61170056) and the National High Technology Research and Development Program ("863" Program) of China (Grant No. 2013AA013203  ... 
doi:10.1109/tc.2013.15 fatcat:6ojzh4sxo5amboxn4uqrbtgshy

Implementation of cost effective hierarchical Hadoop cluster–a case study for education

N S. Kalyan Chakravarthy, N Sudhakar, E Srinivasa Reddy
2018 International Journal of Engineering & Technology  
This case study proposes the Hierarchical Hadoop cluster to alter the way of using ICT.  ...  under hierarchical Hadoop cluster is needed.  ...  A cost-effective energy efficient, scalable Linux based Hadoop cluster is required to address these issues and to keep track of the usage of the ICT services provided by the government to the schools and  ... 
doi:10.14419/ijet.v7i2.21.12174 fatcat:2ynnn437pbcqder7xkn3ptrlhq

Enhancing the performance of distributed big data processing systems using Hadoop and Polybase

Sergii Minukhin, Victor Fedko, Yurii Gnusov
2018 Eastern-European Journal of Enterprise Technologies  
In addition, assessment of the performance of homogeneous and heterogeneous Hadoop clusters is required when working with local resources using the tools of VDI (virtual desktop infrastructure), as well as on  ...  With an increase in the number of elements in the chain and the number of computers in a cluster, a problem arises that is associated with the scalability of the architecture.  ... 
doi:10.15587/1729-4061.2018.139630 fatcat:ahweuwgyyffnrl4w5q3crvmeky

Cloud Computing Enabled Big Multi-Omics Data Analytics

Saraswati Koppad, Annappa B, Georgios V Gkoutos, Animesh Acharjee
2021 Bioinformatics and Biology Insights  
High-throughput experiments enable researchers to explore complex multifactorial diseases through large-scale analysis of omics data.  ...  Recent innovations in computational technologies and approaches, especially in cloud computing, offer a promising, low-cost, and highly flexible solution in the bioinformatics domain.  ...  Acknowledgements We would like to acknowledge the reviewers and the editor for very useful constructive feedback.  ... 
doi:10.1177/11779322211035921 fatcat:7bk7zvxvb5hurhyyu5knuvgqeq

Experimental Setup of Logs Analysis on Distributed File Systems using MapReduce

Madhavi Vaidya, Shrinivas Deshpande
2017 Indian Journal of Science and Technology  
These DFS are installed on infrastructures such as a single virtual machine, a cluster of virtual machines, and the minicloud.  ...  These machines can be of different configurations, or can be virtual machines on a shared LAN that communicate with each other.  ...  Acknowledgement We would like to thank the Head of the institution for granting permission to make use of the web security logs generated from the institution's server.  ... 
doi:10.17485/ijst/2017/v10i29/116504 fatcat:h7eb5tvse5gnverw577734uaga

Enabling Large-Scale Biomedical Analysis in the Cloud

Ying-Chih Lin, Chin-Sheng Yu, Yen-Jen Lin
2013 BioMed Research International  
These developments and applications would facilitate biomedical research to make the vast amount of diversification data meaningful and usable.  ...  Cloud computing is an alternative to crack the nut because it gives concurrent consideration to enable storage and high-performance computing on large-scale data.  ...  Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.  ... 
doi:10.1155/2013/185679 pmid:24288665 pmcid:PMC3832998 fatcat:zi7vkjnqczbx3gydrgbdbbmwhe