
A comprehensive view of Hadoop research—A systematic literature review

Ivanilton Polato, Reginaldo Ré, Alfredo Goldman, Fabio Kon
2014 Journal of Network and Computer Applications  
Solution: We conducted a systematic literature review to assess research contributions to Apache Hadoop.  ...  datasets, but we were able to spot promising areas and suggest topics for future research within the framework.  ...  Table A1 and A2 Table A1 Studies with implementation and/or experiments (MapReduce and data storage & manipulation categories). Appendix A.  ... 
doi:10.1016/j.jnca.2014.07.022 fatcat:4xjveqy6mrctzjc4ou7llyy4u4

Scheduling parallel I/O operations

Ravi Jain, Kiran Somalwar, John Werth, J. C. Browne
1993 SIGARCH Computer Architecture News  
We describe a simple I/O scheduling problem and present approximate algorithms for its solution.  ...  We propose that within the context of such an integrated approach, scheduling parallel I/O operations will become increasingly attractive and can potentially provide substantial performance benefits.  ...  Acknowledgements The first author thanks Ramesh Govindan, Peter Newton and Mark Sullivan for many helpful comments and discussions.  ... 
doi:10.1145/165660.165670 fatcat:uobg64qgffezpgyclreknub5ze

Storing and Handling Complex Content for Large-scale Data

Hong Li Xu, College of Information Science and Engineering, Shan Dong Agricultural University, Shandong Taian 271018, China, Hong Hua Jiang, Qiu Lan Wu, Yuan Yuan Wang
2018 Journal of Communications  
It offered storing and handling complex content for large-scale data and support the further research of big data technical challenges, support for the application of it.  ...  Abstract: At present, the increasing growth and pervasive development of mass data raise the challenge for big data storage.  ...  It supports both write and record additional operations, the write operation allows us write file randomly, and the record appending supports parallel operation more safe and reliable. b) HDFS has the  ... 
doi:10.12720/jcm.13.12.763-768 fatcat:ts6qm2r5mvattd7gdstgmrz7o4

A Comprehensive Study of HBase Storage Architecture—A Systematic Literature Review

Muhammad Umair Hassan, Irfan Yaqoob, Sidra Zulfiqar, Ibrahim A. Hameed
2021 Symmetry  
We perform a systematic literature review on a number of published works proposed for HBase storage architecture.  ...  This paper seeks to define, taxonomically classify, and systematically compare existing research on a broad range of storage technologies, methods, and data models based on HBase storage architecture's  ...  Acknowledgments: We are thankful to the anonymous reviewers for their valuable comments and suggestions in improving our manuscript.  ... 
doi:10.3390/sym13010109 fatcat:6jqnicyw55fgrdjxrjxsl3bozm

Next-Generation Big Data Federation Access Control: A Reference Model [article]

Feras M. Awaysheh, Mamoun Alazab, Maanak Gupta, Tomás F. Pena, José C. Cabaleiro
2019 arXiv   pre-print
The efficiency of the proposed access broker has not sustainably affected the performance overhead. The experimental results show only 1% of each 100 MB read/write operation in a WebHDFS.  ...  Overall, the findings of the paper pave the way for a wide range of revolutionary and state-of-the-art enhancements and future trends within Hadoop stack security and privacy.  ...  The NN receives the client call and allows it to reach data files which are stored in local disks via the DN pool. HDFS applications need a write-once-read-many access model for files.  ... 
arXiv:1912.11588v1 fatcat:3cot2oog6jedbilw2it3w7m47i


Hoang Bui, Peter Bui, Patrick Flynn, Douglas Thain
2010 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing - HPDC '10  
As scientific research becomes more data intensive, there is an increasing need for scalable, reliable, and high performance storage systems.  ...  ROARS is a hybrid approach to distributed storage that provides both large, robust, scalable storage and efficient rich metadata queries for scientific applications.  ...  ACKNOWLEDGEMENTS This work was supported by National Science Foundation grants CCF-06-21434, CNS-06-43229, and CNS-01-30839.  ... 
doi:10.1145/1851476.1851587 dblp:conf/hpdc/BuiBFT10 fatcat:orwx2xon5fhktfuntf4efngzxi

Significance Of Big Data Frameworks And Speculative Approaches In Healthcare Systems

G.P. Hegde, Nagaratna Hegde
2021 International journal of advanced networking and applications  
This would be helpful for researchers to analyse and evaluate the characteristics of frameworks with respect to network throughput and latency.  ...  Selection of nodes for different stages of healthcare is also a challenging issue while selecting data frameworks.  ...  Dynamic processing and big data analysis process of healthcare data has been carried out in a systematic way by big data analytics frameworks [2].  ... 
doi:10.35444/ijana.2021.12609 fatcat:z6hamit2lvgqvfpvlgydx46tci

Cooperation of Simulation and Data Model for Performance Analysis of Complex Systems

B. S. Kim, T. G. Kim
2019 International Journal of Simulation Modelling  
This paper identifies the characteristics of each modelling method and presents a cooperative model development process for performance analysis of complex systems.  ...  Before such a performance analysis, a model for prediction should be constructed. There are two types of models: data model and simulation model.  ...  The application describes the Hadoop programs such as WordCount and TeraSort. The disk I/O describes a storage model for file write, read, and shuffle.  ... 
doi:10.2507/ijsimm18(4)491 fatcat:uzsgri7yujc4lpiixqmxcdkwba

Big data storage technologies: a survey

Aisha Siddiqa, Ahmad Karim, Abdullah Gani
2017 Frontiers of Information Technology & Electronic Engineering  
There is a great thrust in industry toward the development of more feasible and viable tools for storing fast-growing volume, velocity, and diversity of data, termed 'big data'.  ...  big data, investigating the performance and magnitude gains of these technologies.  ...  It is deployed on top of Hadoop and HDFS and facilitates efficient random read/write operations.  ... 
doi:10.1631/fitee.1500441 fatcat:hwc744xqyfevhk6x44djmxtp2q

An Auxiliary Decision-Making System for Electric Power Intelligent Customer Service Based on Hadoop

Shisong Wu, Zhaojie Dong, Rahman Ali
2022 Scientific Programming  
By analyzing the Hadoop big data framework, according to the characteristics and core elements of the HDFS distributed file system, the MapReduce programming model, and the data mining algorithm, the basic  ...  Aiming at the problems of low security, high occupancy rate, and long response time in the current power intelligent customer service assistant decision-making system, a power intelligent customer service  ...  When the block reads and writes, it is equivalent to the case where the Name Node sends a command and the Data Node performs the actual operation. (4) Secondary Name Node : it is mainly used to assist  ... 
doi:10.1155/2022/5165718 fatcat:7qm6dpk7h5alfcdmgtxfmi56iq

A structured modeling technology

Marek Makowski
2005 European Journal of Operational Research  
The modeling process is then characterized, and the requirement analysis for implementation of structured modeling is specified.  ...  The paper starts with a summary of the context of modeling composed of: the role of models in decision-making support; modeling paradigms; and state-of-the-art aspects of modeling complex problems.  ...  and a corresponding set of data), and analysis of results (with type-specific views on various data).  ... 
doi:10.1016/j.ejor.2004.03.037 fatcat:2boro6pomjetbdnrpjhwzexcxi

Evaluating the Open Source Data Containers for Handling Big Geospatial Raster Data

Fei Hu, Mengchao Xu, Jingchao Yang, Yanshou Liang, Kejin Cui, Michael M. Little, Christopher S. Lynnes, Daniel Q. Duffy, Chaowei Yang
2018 ISPRS International Journal of Geo-Information  
data model, and data operations); and (b) practical use experience and performance (data preprocessing, data uploading, query speed, and resource consumption).  ...  The runtime and computing resources (e.g., CPU, memory, hard drive, and network) consumption are assessed for their performance evaluation and analysis.  ...  Acknowledgments: This project is funded by NASA AIST (NNX15AM85G) and NSF (IIP-1338925 and ICER-1540998). We thank the anonymous reviewers for their insightful comments and reviews.  ... 
doi:10.3390/ijgi7040144 fatcat:csbbnucfbzd2za4ghkqnyclihm

Big Data Processing Platform on Intelligent Transportation Systems

Saida EL MENDILI
2019 International Journal of Advanced Trends in Computer Science and Engineering  
In order to overcome this problem, it is essential to create a Big Data modeling approach for ITS, which pays particular attention to the creation of multiple layers.  ...  In fact, we will propose a Big Data processing design applied to Intelligent Transportation Systems. We will adopt a data modeling approach that treats both the transmission and the processing data.  ...  Real-time processing reads and writes data to different systems, including those that generate and use a constant data flow.  ... 
doi:10.30534/ijatcse/2019/16842019 fatcat:e62kibohbzclpa4cvr6nvwqfbq

Hybrid parallelization strategies for large-scale machine learning in SystemML

Matthias Boehm, Shirish Tatikonda, Berthold Reinwald, Prithviraj Sen, Yuanyuan Tian, Douglas R. Burdick, Shivakumar Vaithyanathan
2014 Proceedings of the VLDB Endowment  
In this paper, we present a systematic approach for combining task and data parallelism for large-scale machine learning on top of MapReduce.  ...  We employ a generic Parallel FOR construct (ParFOR) as known from high performance computing (HPC).  ...  For high performance of partitioning and read, we also support block-wise partitioning (groups of rows or columns) with a block size close to the HDFS block size.  ... 
doi:10.14778/2732286.2732292 fatcat:2mqx7oufxjf27dpr4v4cfdyolm

Performance Benefits of DataMPI: A Case Study with BigDataBench [chapter]

Fan Liang, Chen Feng, Xiaoyi Lu, Zhiwei Xu
2014 Lecture Notes in Computer Science  
In this paper, we use BigDataBench, a Big Data benchmark suite, to do comprehensive studies on performance and resource utilization characterizations of Hadoop, Spark and DataMPI.  ...  On the other hand, high-performance data analysis requirements are causing academical and industrial communities to adopt state-of-the-art technologies in HPC to solve Big Data problems.  ...  Lei Wang and Zijian Ming for their help to support this research, and also to the anonymous reviewers.  ... 
doi:10.1007/978-3-319-13021-7_9 fatcat:qaayps6s5bftvesjlcpmv5f2ri