Optimizing the performance of hadoop clusters through efficient cluster management techniques

K S. Shraddha Bollamma, S Manishankar, M V. Vishnu
2018 International Journal of Engineering & Technology  
Thus this integrated platform helps in monitoring the distributed cluster as well as improving the performance of the overall Big Data processing.  ...  ZooKeeper does the monitoring of cluster nodes of the distributed system and identifies critical performance problems.  ...  To extract the value of data, data analysis methods are applied to big data. The basic characteristics of Big Data are the 3 Vs, that is, velocity, volume and variety.  ... 
doi:10.14419/ijet.v7i2.31.13389 fatcat:ingdeolg55hv7cauyjjtpa7we4

Research of Performance of Distributed Platforms Based on Clustering Algorithm

Di Jian, Yanfeng Peng
2016 Journal of Computers  
find out that the match of Spark and YARN shows more effective on clustering results and consumes less time on the execution of programs, so it's more suitable for cluster analysis of big data.  ...  This paper is based on the parallelization of platforms' K-means algorithm, by building a YARN cluster environment and making experiments to analyze performance of two distributed platforms, and finally  ...  Experiment and Analysis Experiment 1: Comparison and analysis of distributed platforms' performance.  ... 
doi:10.17706/jcp.11.3.195-200 fatcat:fcoak5zvxzaxzcdq7tdg3beeum


2018 TASK Quarterly  
Several Big Data platforms like Apache Spark or YARN have become widely used in analytics and High-Performance Computing systems due to the reliability and usability of Map Reduce implementations.  ...  Climate change caused by human activities can influence the lives of everybody on the planet. The environmental concerns must be taken into consideration by all fields of study including ICT.  ...  Multiple platforms (such as Apache Hadoop [5] or Apache Spark [6] etc.) that are able to ease the Big Data processing by distributing both data and computations between the nodes in the cluster have  ... 
doi:10.17466/tq2018/22.4/c doaj:df22cde55b914155b34d8d6432945f32 fatcat:cp53shhno5f5tg6t6c74c62jve

Privacy-Aware Data Forensics of VRUs Using Machine Learning and Big Data Analytics

Muhammad Babar, Muhammad Usman Tariq, Ahmed S. Almasoud, Mohammad Dahman Alshehri, Farhan Ullah
2021 Security and Communication Networks  
In this study, an architecture is proposed based on machine learning to analyze and process big data efficiently in a secure environment.  ...  The proposed architecture is a layered framework with a parallel and distributed module using machine learning on big data to achieve secure big data analytics.  ...  Task distribution in the cluster is carried out using the YARN distributed cluster management framework. The YARN is equipped with dynamic programming for task distribution and cluster management. The previous  ... 
doi:10.1155/2021/3320436 fatcat:zqwggcym6jdargaia4jd3ngazm

A Comparative Study on Streaming Frameworks for Big Data

Wissem Inoubli, Sabeur Aridhi, Haithem Mezni, Mondher Maddouri, Engelbert Mephu Nguifo
2018 Very Large Data Bases Conference  
In this paper, we discuss the challenges of Big Data and we survey existing streaming frameworks for Big Data.  ...  Yet, many research works focus on streaming in Big Data, a task referring to the processing of massive volumes of structured/unstructured streaming data.  ...  Acknowledgements This research was partially supported by the General Direction of Scientific Research in Tunisia (DGRST).  ... 
dblp:conf/vldb/InoubliAMMN18 fatcat:pjb6jwacardhhenoep5aqs3tse

IoT-Enabled Big Data Analytics Architecture for Multimedia Data Communications

Muhammad Babar, Mohammad Dahman Alshehri, Muhammad Usman Tariq, Fasee Ullah, Atif Khan, M. Irfan Uddin, Ahmed S. Almasoud, Deepak Gupta
2021 Wireless Communications and Mobile Computing  
The proposed architecture is a layered architecture integrated with a parallel and distributed module to accomplish big data analytics for multimedia data.  ...  There are several challenges in the existing structural design of the IoT-enabled data management systems to handle MMBD including high-volume storage and processing of data, data heterogeneity due to  ...  Acknowledgments This study was supported by Taif University Researchers Supporting Project (number TURSP-2020/126), Taif University, Taif, Saudi Arabia.  ... 
doi:10.1155/2021/5283309 fatcat:eeycgvk6inblhopeukijwlvr2y

Performance Evaluation of Hadoop in Cloud for Big Data

Mohammed Fakherldin, Ibrahim Aaker Targio Hashem, Abdullah Alzuabi, Faiz Alotaibi
2018 International Journal of Engineering & Technology  
This paper provides a review and analysis of the impact of using physical versus cloud cluster in the processing a large amount of data.  ...  Hadoop is one of the most popular platforms for big data, thus, Hadoop MapReduce is used to store data in Hadoop distributed file systems.  ...  YARN: Hadoop Yarn is a framework, which provides a management solution for big data in distributed environments [10] .  ... 
doi:10.14419/ijet.v7i4.15.21363 fatcat:bulmfekcfrhuxjbkl4rvcaf5je

A Comparative Analysis of Big Data Frameworks: An Adoption Perspective

Madiha Khalid, Muhammad Murtaza Yousaf
2021 Applied Sciences  
We identify key requirements of a big framework and review each of these frameworks in the perspective of those requirements.  ...  of data.  ...  Data Availability Statement: Not applicable. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app112211033 fatcat:mfh3thwe5ngdnolkhc264rkbdi

A Review of Weather Data Analytics using Big Data

Priyanka Chouksey, Abhishek Singh Chauhan
2017 IJARCCE  
Big Data Analytics technology like Hadoop MapReduce and Spark are playing great role in handling huge amount of data.  ...  This project aims to study the analytics of weather data using MapReduce and Spark.  ...  Spark with in memory computing also gives very good performance for analysis of unstructured data.  ... 
doi:10.17148/ijarcce.2017.6172 fatcat:wdr6fmulejbkpkyct2vp6mhucy

Apache Hadoop-MapReduce on YARN framework latency

Abdelaziz EL YAZIDI, Mohamed Saad AZIZI, Yassine BENLACHMI, Moulay Lahcen HASNAOUI
2021 Procedia Computer Science  
In this paper, we discuss the challenges of Big Data and we are going to present the famous Apache Hadoop framework and we are going to test the latency at the level of execution of a MapReduce job based  ...  Related work In this study that we did, we studied the performance of framework Hadoop MapReduce in relation to the size of the files used by a client, namely in our case, we noticed the speed of processing  ... 
doi:10.1016/j.procs.2021.03.100 fatcat:27e6aqmr3redncgmmvb7c2ku6a

BigDataNetSim: A Simulator for Data and Process Placement in Large Big Data Platforms

Leandro Batista de Almeida, Eduardo Cunha de Almeida, John Murphy, Robson E. De Grande, Anthony Ventresque
2018 2018 IEEE/ACM 22nd International Symposium on Distributed Simulation and Real Time Applications (DS-RT)  
BigDataNetSim Architecture BigDataNetSim is a simulator designed to execute analysis of data placement, task placement and network parameters of a Big Data cluster based on Hadoop (HDFS/YARN) and data  ...  Data distribution in HDFS The data distribution reports show how the blocks of a particular file are distributed among the cluster nodes.  ... 
doi:10.1109/distra.2018.8601018 dblp:conf/dsrt/AlmeidaA0GV18 fatcat:sdn5bombtvdbdnru4b4r7xcpl4

Big Data Analytics Technologies and Platforms: A Brief Review

Ticiana L. Coelho da Silva, Regis Pires Magalhães, Igo Ramalho Brilhante, José A. F. de Macêdo, David Araújo, Paulo A. L. Rego, Aloisio Vieira Lira Neto
2018 Very Large Data Bases Conference  
A plethora of Big Data Analytics technologies and platforms have been proposed in the last years. However, in 2017, only 53% of companies are adopting such tools.  ...  In this paper, we aim at helping organizations in the selection of technologies/platforms more appropriate to their analytic processes by offering a short-review according to some categories of Big Data  ...  Finally, Table 1 Big Data Platforms A Big Data platform is an ecosystem of services and technologies that needs to perform analysis on voluminous, complex and dynamic data.  ... 
dblp:conf/vldb/SilvaMBMARN18 fatcat:ny53uz6ixre7nnib3hga24hque

Text Mining with Apache Hadoop Over different Hadoop Clusters Architectures

2019 International journal of recent technology and engineering  
In this paper, Big Data allows Hadoop platform to boost the processing speed over large datasets through cluster architectures, which are studied and analyzed through text documents from newsgroup20 dataset. It  ...  Large number of documents are managed and maintained through popular leading Big Data platform is Hadoop. It maintains all the information at Hadoop Distributed File System in Blocks.  ...  Following are some of the most favouring circumstances in Big Data through Hadoop clusters: A Hadoop cluster absolutely performs parallel processing to help with the analysis, but clusters are lacking due  ... 
doi:10.35940/ijrte.b1866.078219 fatcat:rjrjn5wr7becbk77osnc73shly

Simulations of Hadoop/MapReduce-Based Platform to Support its Usability of Big Data Analytics in Healthcare

Dillon Chrimes, Hamid Zamani, Belaid Moa, Alex Kuo
2018 Athens Journal of Technology & Engineering  
The study objective to establish an interactive Big Data Platform (BDA) was successful implemented in that Hadoop/MapReduce technologies formed the framework of the platform distributed with HBase (key-value  ...  Therefore, to implement, an existing High Performance Computing (HPC) Linux node clusters via WestGrid were used to represent a simulation of patient data benchmarked and cross-referenced with current  ...  The most impactful technology of the Big Data components in this study was MapReduce (and Java code performance therein).  ... 
doi:10.30958/ajte.5-3-1 fatcat:ms27xgx2prfbziyiwpnuhpn64m

Parallel processing on Big Data in the context of Machine Learning and Hadoop Ecosystem: A Survey

Anilkumar Vishwanath Brahmane, R Murugan
2018 International Journal of Engineering & Technology  
As an effect, different categories of packages, distributions and technologies have been developed. In this paper an evaluation is done, this studies recent technologies developed for Big Data.  ...  On the other hand, in Big Data perspective, customary information methods and policies are not as much of capable.  ...  It provides Big Data connectors for high-performance and efficient connectivity. It includes also an open source oracle distribution of R to support advanced analysis.  ... 
doi:10.14419/ijet.v7i2.7.10885 fatcat:goyvvzlwsbeifi62nrldkgp3yy