
LogM: Log Analysis for Multiple Components of Hadoop Platform

Yuxia Xie, Kai Yang, Pan Luo
2021 IEEE Access  
To facilitate the storage and processing of big data, a Hadoop platform typically runs on a cluster of servers and may scale up to process big data over thousands of hardware nodes.  ...  The Hadoop platform provides a powerful software framework for distributed storage and processing of massive amounts of data.  ...  In terms of design principle, the Hadoop platform is deployed on a large number of hardware nodes that can provide computing and storage services locally.  ... 
doi:10.1109/access.2021.3076897 fatcat:g3xen2dhejb5niyxwepmuob3r4

Large Scale Audio-Visual Video Analytics Platform for Forensic Investigations of Terroristic Attacks [article]

Alexander Schindler, Martin Boyer, Andrew Lindley, David Schreiber, Thomas Philipp
2018 arXiv   pre-print
The platform integrates analytical modules for different input modalities on a scalable architecture. Videos are analyzed according to their acoustic and visual content.  ...  The heterogeneous results of the analytical modules are fused into a distributed index of visual and acoustic concepts to facilitate rapid start of investigations, following traits and investigating witness  ...  Audio Analysis: Audio analysis is one of the key components of this platform.  ... 
arXiv:1811.11623v1 fatcat:k2uxr3djhrefpmtxt4wi376eyq

Monitoring WLCG with lambda-architecture: a new scalable data store and analytics platform for monitoring at petabyte scale

L Magnoni, U Suthakar, C Cordeiro, M Georgiou, J Andreeva, A Khan, D R Smith
2015 Journal of Physics, Conference Series  
Monitoring the WLCG infrastructure requires the gathering and analysis of a high volume of heterogeneous data (e.g. data transfers, job monitoring, site tests) coming from different services and experiment-specific  ...  Elasticsearch) and a description of a proof of concept implementation, based on Apache Spark and Esper, for the real-time part which compensates for batch-processing latency and automates problem detection  ...  and to concentrate analysis and transformation on the same framework with batch and real-time components.  ... 
doi:10.1088/1742-6596/664/5/052023 fatcat:kw5qzgcilvgsfd2wjenzqdov7e

Manufacturing process data analysis pipelines: a requirements analysis and survey

Ahmed Ismail, Hong-Linh Truong, Wolfgang Kastner
2019 Journal of Big Data  
Platform: The purpose of this survey is to review platforms. We base our definition of a platform on [12].  ...  This is followed by an analysis on the requirements for big data platforms (RQ1) and on the recent big data platforms for process data analysis (RQ2).  ... 
doi:10.1186/s40537-018-0162-3 fatcat:6tlovbsubzhqfagjjyimsythm4

Healthcare Driven by Big Data Analytics

Cheryl Ann Alexander, Lidong Wang
2018 American Journal of Engineering and Applied Sciences  
Pig (Pig Latin): uses the textual Pig Latin language in a large-scale analysis platform and produces a sequence of MapReduce programs on a Hadoop cluster.  ...  Hadoop YARN: can separate resource management and processing components. HDFS: Hadoop Distributed File System; a Hadoop storage system that quickly distributes data across several nodes in a cluster.  ... 
doi:10.3844/ajeassp.2018.1154.1163 fatcat:qn63grbu7vfd5kv7nsm5fwvqr4

A Conceptual Framework for using Big Data in Egyptian Agriculture

Sayed Ahmed, Amira S. Mahmoud, Eslam Farg, Amany M. Mohamed, Marwa S. Moustafa, Mohamed A. E. AbdelRahman, Hisham M. AbdelSalam, Sayed M. Arafat
2022 International Journal of Advanced Computer Science and Applications  
were described and characterized. 3) A conceptual framework for Egyptian agriculture practice based on BD analytics was introduced. 4) Challenges and extensive recommendations have been provided, which  ...  agriculture sector in responding to two main questions: 1) Which techniques, frameworks and data types were adopted. 2) Identification of the existing gap associated with the data sources, modeling, and analysis  ...  Apache Kafka is based on five main components: Producer, Topics, Consumer, Partitions, and Brokers. The Producer component is responsible for writing to a topic in the Kafka system.  ... 
doi:10.14569/ijacsa.2022.0130322 fatcat:i6exbyctdbgtndzqubfbqps7eq


Michael E. Fisk, Curtis L. Hash
2014 Proceedings of the Fourth International Workshop on Cloud Data and Platforms - CloudDP '14  
Our on SMP systems, clusters built for Hadoop, and this distributed cloud, show that FileMap outperforms more prevalent computing systems and models by factors between 2x (compared to Hadoop) and 14x (  ...  compute platforms, file systems, and software.  ...  We conducted tests on 4 platforms: a laptop, a 48-way SMP, a cluster built for Hadoop, and a geographically distributed heterogeneous cloud.  ... 
doi:10.1145/2592784.2592790 dblp:conf/eurosys/FiskH14 fatcat:gsgtsdfmrfc6jlkbc3r5f53csy

Parallel big data processing system for security monitoring in Internet of Things networks

Igor V. Kotenko, Igor Saenko, Alexey Kushnerevich
2017 Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications  
Given the limited computing capabilities of IoT networks, we propose the architecture of a big data distributed parallel processing system based on Hadoop and Spark software platforms.  ...  A comparative evaluation of the implemented systems on Hadoop and Spark platforms is conducted.  ...  traditional methods and means of  ...  In [11], a system based on Hadoop-compatible software and the Java platform is considered. This system consists of user interaction, data analysis, and data storage components.  ... 
doi:10.22667/jowua.2017.12.31.060 dblp:journals/jowua/KotenkoSK17 fatcat:jxuz4kfxxrhonc4b4pbrqwtmxy

Confluences among Big Data, Finite Element Analysis and High Performance Computing

Lidong Wang, Guanghui Wang, Cheryl Ann Alexander
2015 American Journal of Engineering and Applied Sciences  
In many applications, Finite Element Analysis (FEA) strongly relies on advanced computer technology and HPC. Big Data will play an important role in FEA and HPC.  ...  It has great impacts on scientific discoveries and value creation.  ...  Hadoop is a Java-based framework and heterogeneous open source platform. Hadoop's primary modules are the Hadoop Distributed File System (HDFS) and MapReduce (MR).  ... 
doi:10.3844/ajeassp.2015.767.774 fatcat:g6mkfiexzngbjleh4icsz4ekwq

A Survey on Security Issues and the Existing Solutions in Big Data

Pooja Chaudhary, Virendra Kumar
2017 International Journal of Computer Applications  
Here I have described only one aspect of big data; other attributes are volume, velocity, value, and veracity.  ...  Hadoop YARN: a resource management platform responsible for managing computing resources in clusters and scheduling the user's application.  ...  The author used Fibonacci-based steganography to hide some data at the time of communication, which follows this principle: first, the image is represented in Fibonacci form, and after that a secret bit will be  ... 
doi:10.5120/ijca2017913405 fatcat:4osdjn7xgnepvcixbuuoihnmzy

Heterogeneous Data and Big Data Analytics

Lidong Wang
2017 Automatic Control and Information Sciences  
Heterogeneity is one of the major features of big data, and heterogeneous data result in problems in data integration and Big Data analytics.  ...  Challenges of dealing with heterogeneous data and Big Data analytics are also discussed.  ...  Most data integration platforms use a primary integration model based on either relational or XML data types.  ... 
doi:10.12691/acis-3-1-3 fatcat:t3yzrk4r2bfornki34khobe4su

Big Data and Business Analytics: Trends, Platforms, Success Factors and Applications

Ajah, Nweke
2019 Big Data and Cognitive Computing  
The paper reviews and discusses the recent trends, opportunities and pitfalls of big data and how it has enabled organizations to create successful business strategies and remain competitive, based on  ...  Therefore, data proliferation requires a rethinking of techniques for capturing, storing, and processing the data. This is the role big data has come to play.  ...  It is efficient for real-time analysis and distributed stream processing in Hadoop. It provides high-performance data operations with an efficient fault tolerance mechanism based on a distributed snapshot.  ... 
doi:10.3390/bdcc3020032 fatcat:ad6fhyndkffnlm4basxkmfwbsq

Analysis and Evaluation of Techniques for Managing Unstructured and Semi-Structured Data in a MapReduce Platform

Dina Darwish
2017 International Journal Of Engineering And Computer Science  
MapReduce is one of the most popular platforms in which the dataflow is in the form of a directed acyclic graph of operators.  ...  In this paper, we develop the engineering principles and practices to manage unstructured and semi-structured data in a MapReduce platform.  ...  In the following sections, a detailed analysis and description of storage, query, and update and index principles in a MapReduce platform is going to be explained.  ... 
doi:10.18535/ijecs/v6i2.03 fatcat:qonsrnvrtng4fkzvqqluwcqft4

Big Data Knowledge Discovery Platforms: A 360 Degree Perspective

2019 International Journal of Engineering and Advanced Technology  
This study encompasses a comprehensive review of Big Data analytical platforms and frameworks with their comparative analysis.  ...  These platforms and architecture are giving a cutting edge to the Big Data Knowledge Discovery process by using Artificial Intelligence, Machine Learning and Expert systems.  ...  The prevailing representative machine learning algorithms for data dimensionality reduction embrace "Principal Component Analysis (PCA)", "Linear Discriminant Analysis (LDA)", "Locally Linear Embedding (LLE  ... 
doi:10.35940/ijeat.b3901.129219 fatcat:2w7a5tkmsfah7oft4jacfbzdau

Industrial Big Data Analytics: Challenges, Methodologies, and Applications [article]

JunPing Wang, WenSheng Zhang, YouKang Shi, ShiHui Duan, Jin Liu
2018 arXiv   pre-print
These challenges for industrial big data analytics are real-time analysis and decision-making from massive heterogeneous data sources in the manufacturing space.  ...  While manufacturers have been generating highly distributed data from various systems, devices and applications, a number of challenges in both data management and data analysis require new approaches  ...  This platform consists of the Hadoop kernel, Map/Reduce and Hadoop distributed file system (HDFS), as well as a number of related projects, including Apache Hive, Apache HBase, and so on.  ... 
arXiv:1807.01016v2 fatcat:wyvz2pxasjh3pm6t7ozow3hlki