A comprehensive view of Hadoop research—A systematic literature review
2014
Journal of Network and Computer Applications
Solution: We conducted a systematic literature review to assess research contributions to Apache Hadoop. ...
datasets, but we were able to spot promising areas and suggest topics for future research within the framework. ...
Table A1 and A2 Table A1 Studies with implementation and/or experiments (MapReduce and data storage & manipulation categories).
Appendix A. ...
doi:10.1016/j.jnca.2014.07.022
fatcat:4xjveqy6mrctzjc4ou7llyy4u4
Scheduling parallel I/O operations
1993
SIGARCH Computer Architecture News
We describe a simple I/O scheduling problem and present approximate algorithms for its solution. ...
We propose that within the context of such an integrated approach, scheduling parallel I/O operations will become increasingly attractive and can potentially provide substantial performance benefits. ...
Acknowledgements The first author thanks Ramesh Govindan, Peter Newton and Mark Sullivan for many helpful comments and discussions. ...
doi:10.1145/165660.165670
fatcat:uobg64qgffezpgyclreknub5ze
Storing and Handling Complex Content for Large-scale Data
2018
Journal of Communications
It offered storing and handling complex content for large-scale data and support the further research of big data technical challenges, support for the application of it. ...
Abstract: At present, the increasing growth and pervasive development of mass data raise the challenge for big data storage. ...
It supports both write and record additional operations, the write operation allows us write file randomly, and the record appending supports parallel operation more safe and reliable. b) HDFS has the ...
doi:10.12720/jcm.13.12.763-768
fatcat:ts6qm2r5mvattd7gdstgmrz7o4
A Comprehensive Study of HBase Storage Architecture—A Systematic Literature Review
2021
Symmetry
We perform a systematic literature review on a number of published works proposed for HBase storage architecture. ...
This paper seeks to define, taxonomically classify, and systematically compare existing research on a broad range of storage technologies, methods, and data models based on HBase storage architecture's ...
Acknowledgments: We are thankful to the anonymous reviewers for their valuable comments and suggestions in improving our manuscript. ...
doi:10.3390/sym13010109
fatcat:6jqnicyw55fgrdjxrjxsl3bozm
Next-Generation Big Data Federation Access Control: A Reference Model
[article]
2019
arXiv
pre-print
The efficiency of the proposed access broker has not sustainably affected the performance overhead. The experimental results show only 1% of each 100 MB read/write operation in a WebHDFS. ...
Overall, the findings of the paper pave the way for a wide range of revolutionary and state-of-the-art enhancements and future trends within Hadoop stack security and privacy. ...
The NN receives the client call and allows it to reach data files which are stored in local disks via the DN pool. HDFS applications need a write-once-read-many access model for files. ...
arXiv:1912.11588v1
fatcat:3cot2oog6jedbilw2it3w7m47i
As scientific research becomes more data intensive, there is an increasing need for scalable, reliable, and high performance storage systems. ...
ROARS is a hybrid approach to distributed storage that provides both large, robust, scalable storage and efficient rich metadata queries for scientific applications. ...
ACKNOWLEDGEMENTS This work was supported by National Science Foundation grants CCF-06-21434, CNS-06-43229, and CNS-01-30839. ...
doi:10.1145/1851476.1851587
dblp:conf/hpdc/BuiBFT10
fatcat:orwx2xon5fhktfuntf4efngzxi
Significance Of Big Data Frameworks And Speculative Approaches In Healthcare Systems
2021
International journal of advanced networking and applications
This would be helpful for researchers to analyse and evaluate the characteristics of frameworks with respect to network throughput and latency. ...
Selection of nodes for different stages of healthcare is also a challenging issue while selecting data frameworks. ...
Dynamic processing and big data analysis process of healthcare data has been carried out in a systematic way by big data analytics frameworks [2]. ...
doi:10.35444/ijana.2021.12609
fatcat:z6hamit2lvgqvfpvlgydx46tci
Cooperation of Simulation and Data Model for Performance Analysis of Complex Systems
2019
International Journal of Simulation Modelling
This paper identifies the characteristics of each modelling method and presents a cooperative model development process for performance analysis of complex systems. ...
Before such a performance analysis, a model for prediction should be constructed. There are two types of models: data model and simulation model. ...
The application describes the Hadoop programs such as WordCount and TeraSort. The disk I/O describes a storage model for file write, read, and shuffle. ...
doi:10.2507/ijsimm18(4)491
fatcat:uzsgri7yujc4lpiixqmxcdkwba
Big data storage technologies: a survey
2017
Frontiers of Information Technology & Electronic Engineering
There is a great thrust in industry toward the development of more feasible and viable tools for storing fast-growing volume, velocity, and diversity of data, termed 'big data'. ...
big data, investigating the performance and magnitude gains of these technologies. ...
It is deployed on top of Hadoop and HDFS and facilitates efficient random read/write operations. ...
doi:10.1631/fitee.1500441
fatcat:hwc744xqyfevhk6x44djmxtp2q
An Auxiliary Decision-Making System for Electric Power Intelligent Customer Service Based on Hadoop
2022
Scientific Programming
By analyzing the Hadoop big data framework, according to the characteristics and core elements of the HDFS distributed file system, the MapReduce programming model, and the data mining algorithm, the basic ...
Aiming at the problems of low security, high occupancy rate, and long response time in the current power intelligent customer service assistant decision-making system, a power intelligent customer service ...
When the block reads and writes, it is equivalent to the case where the Name Node sends a command and the Data Node performs the actual operation. (4) Secondary Name Node : it is mainly used to assist ...
doi:10.1155/2022/5165718
fatcat:7qm6dpk7h5alfcdmgtxfmi56iq
A structured modeling technology
2005
European Journal of Operational Research
The modeling process is then characterized, and the requirement analysis for implementation of structured modeling is specified. ...
The paper starts with a summary of the context of modeling composed of: the role of models in decision-making support; modeling paradigms; and state-of-the-art aspects of modeling complex problems. ...
and a corresponding set of data), and analysis of results (with type-specific views on various data). ...
doi:10.1016/j.ejor.2004.03.037
fatcat:2boro6pomjetbdnrpjhwzexcxi
Evaluating the Open Source Data Containers for Handling Big Geospatial Raster Data
2018
ISPRS International Journal of Geo-Information
data model, and data operations); and (b) practical use experience and performance (data preprocessing, data uploading, query speed, and resource consumption). ...
The runtime and computing resources (e.g., CPU, memory, hard drive, and network) consumption are assessed for their performance evaluation and analysis. ...
Acknowledgments: This project is funded by NASA AIST (NNX15AM85G) and NSF (IIP-1338925 and ICER-1540998). We thank the anonymous reviewers for their insightful comments and reviews. ...
doi:10.3390/ijgi7040144
fatcat:csbbnucfbzd2za4ghkqnyclihm
Big Data Processing Platform on Intelligent Transportation Systems
2019
International Journal of Advanced Trends in Computer Science and Engineering
In order to overcome this problem, it is essential to create a Big Data modeling approach for ITS, which pays particular attention to the creation of multiple layers. ...
In fact, we will propose a Big Data processing design applied to Intelligent Transportation Systems. We will adopt a data modeling approach that treats both the transmission and the processing data. ...
Real-time processing reads and writes data to different systems, including those that generate and use a constant data flow. ...
doi:10.30534/ijatcse/2019/16842019
fatcat:e62kibohbzclpa4cvr6nvwqfbq
Hybrid parallelization strategies for large-scale machine learning in SystemML
2014
Proceedings of the VLDB Endowment
In this paper, we present a systematic approach for combining task and data parallelism for large-scale machine learning on top of MapReduce. ...
We employ a generic Parallel FOR construct (ParFOR) as known from high performance computing (HPC). ...
For high performance of partitioning and read, we also support block-wise partitioning (groups of rows or columns) with a block size close to the HDFS block size. ...
doi:10.14778/2732286.2732292
fatcat:2mqx7oufxjf27dpr4v4cfdyolm
Performance Benefits of DataMPI: A Case Study with BigDataBench
[chapter]
2014
Lecture Notes in Computer Science
In this paper, we use BigDataBench, a Big Data benchmark suite, to do comprehensive studies on performance and resource utilization characterizations of Hadoop, Spark and DataMPI. ...
On the other hand, high-performance data analysis requirements are causing academical and industrial communities to adopt state-of-the-art technologies in HPC to solve Big Data problems. ...
Lei Wang and Zijian Ming for their help to support this research, and also to the anonymous reviewers. ...
doi:10.1007/978-3-319-13021-7_9
fatcat:qaayps6s5bftvesjlcpmv5f2ri