Big Data Mining Techniques

Adeel Shiraz Hashmi, Tanvir Ahmad
2016 Indian Journal of Science and Technology  
Big Data Tools: Message Passing Interface (MPI) is a standard used for developing and running parallel applications on a peer-to-peer network. MPI is available for many programming languages.  ...  Compute Unified Device Architecture (CUDA) is a GPGPU framework for parallel computing on NVIDIA GPUs. The major drawback of GPGPU is the limited size of memory (maximum of 12 GB as of now).  ... 
doi:10.17485/ijst/2016/v9i37/85826 fatcat:bx5lbuefwvbmdmb3cenmy3n3qa

Adaptive Neural Network Classifier-Based Analysis of Big Data in Health Care [chapter]

Manaswini Pradhan
2018 Data Mining  
Therefore, in this paper, a FCM based Map-Reduce programming model is proposed for the parallel computing using AANN approach.  ...  The FCM based Map-Reduce, clusters the large medical datasets into smaller groups of certain similarity and assigns each data cluster to one Mapper, where the training of neural networks are done by the  ...  In order to minimize the computational complexity and the memory requirement while leading large healthcare data, it is suggested to have a parallel adaptive artificial neural network (AANN) technique  ... 
doi:10.5772/intechopen.77225 fatcat:enag735qdzbill5gwsp3rlww3a

A parallel and distributed stochastic gradient descent implementation using commodity clusters

Robert K. L. Kennedy, Taghi M. Khoshgoftaar, Flavio Villanustre, Timothy Humphrey
2019 Journal of Big Data  
Additionally, neural networks benefit from training on Big Data, as typically more data produces more performant models [1].  ...  In this paper, we present a novel distributed and parallel implementation of stochastic gradient descent (SGD) on a distributed cluster of commodity computers.  ... 
doi:10.1186/s40537-019-0179-2 fatcat:qavklbgw3vd2fp4flnds3km34e

MapReduce Based Parallel Neural Networks in Enabling Large Scale Machine Learning

Yang Liu, Jie Yang, Yuan Huang, Lixiong Xu, Siguang Li, Man Qi
2015 Computational Intelligence and Neuroscience  
For this purpose, this paper parallelizes neural networks based on MapReduce, which has become a major computing model to facilitate data intensive applications.  ...  The performance of the parallelized neural networks is evaluated in an experimental MapReduce computer cluster from the aspects of accuracy in classification and efficiency in computation.  ...  [4] presented a parallel neural network using in-memory data processing techniques to speed up the computation of the neural network but without considering the accuracy aspect of the implemented parallel  ... 
doi:10.1155/2015/297672 pmid:26681933 pmcid:PMC4670636 fatcat:edbckdr3qrhe7dexxyj7iuusfu

Artificial neural networks based techniques for anomaly detection in Apache Spark

Ahmad Alnafessah, Giuliano Casale
2019 Cluster Computing  
Motivated by this issue, we propose an artificial neural network based methodology for anomaly detection tailored to the Apache Spark in-memory processing platform.  ...  Late detection and manual resolutions of performance anomalies in Cloud Computing and Big Data systems may lead to performance violations and financial penalties.  ...  This leads to a demand for performance anomaly detection in cloud computing and Big Data systems that are both dynamic and proactive in nature [21].  ... 
doi:10.1007/s10586-019-02998-y fatcat:fgfp27k4xfbrpf2vv4qlmuovme

Software Abstractions for Large-Scale Deep Learning Models in Big Data Analytics

Ayaz H Khan, Ali Mustafa, Aneeq Yusuf, Rehanullah Khan
2019 International Journal of Advanced Computer Science and Applications  
In this paper, we present the latest research trends in the development of parallel algorithms, optimization techniques, tools and libraries related to big data analytics and deep learning on various parallel  ...  We obtained about 5-30% reduction in the execution time of the deep auto-encoder model even on a single node Hadoop cluster.  ...  Introduction: Big volumes of data have been started to accumulate based on the advancements in sensor technology, the Internet, social networks, wireless communication, and inexpensive memory in various  ... 
doi:10.14569/ijacsa.2019.0100469 fatcat:h4nmna5mpfhk7mpwd2anyiuyma

Usages of Spark Framework with Different Machine Learning Algorithms

Mohamed Ali Mohamed, Ibrahim Mahmoud El-henawy, Ahmad Salah, Ahmed Mostafa Khalil
2021 Computational Intelligence and Neuroscience  
The Internet of Things is a term that describes the process of connecting computers, smart devices, and other data-generating equipment to a network and transmitting data.  ...  As a result, data is produced and updated on a regular basis to reflect changes in all areas and activities.  ...  potential for Big Data clustering [21] .  ... 
doi:10.1155/2021/1896953 fatcat:y3bkzwmtt5cfnmscww33qvydiu

Predictive Analytics On Big Data - An Overview

Gayathri Nagarajan, Dhinesh Babu L.D
2019 Informatica (Ljubljana, Tiskana izd.)  
While research works carried out continuously to handle big data is at one end, processing it to develop the business insights is a hot topic to work on the other end.  ...  The overview throws light on the core predictive models, challenges of these models on big data, research gaps in several domain sectors and using different techniques.  ...  Neural networks: Neural network is a commonly used soft computing technique for predictive analytics.  ... 
doi:10.31449/inf.v43i4.2577 fatcat:hqi45o6t7jb63dr3aaesink6l4

Improved k-Means Clustering Algorithm for Big Data Based on Distributed Smartphone Neural Engine Processor

Fouad H. Awad, Murtadha M. Hamad
2022 Electronics  
Running the k-means clustering in a distributed scheme run based on mobile machine learning efficiently can handle the big data clustering over the network.  ...  Clustering is one of the most significant applications in the big data field.  ...  In 2020, Reference [26] proposed a new solution to improve the k-means clustering for big data using the Hadoop parallel framework.  ... 
doi:10.3390/electronics11060883 doaj:ff31302813794dc5be0f3cddde1cf9f2 fatcat:omy4erer7vfufllss44gwzfuge

A Comprehensive Analysis of Proprietary and Open Source Data Mining Tools

Sonia Rani Chowdhary, Mr Vikash
2020 International Journal of Scientific Research in Computer Science Engineering and Information Technology  
This paper described the (a) various tools and techniques used by data mining applications. (b) compared features and limitations both in Proprietary and open sources data mining tools.  ...  The Powerful software tools and techniques required for the development of data mining applications.  ...  Social Network Analysis • Parallel Computing, Graphics • Visualization of geo spatial data • Web Application Big dataData and error handling • Requires Knowledge of array language • Less specialized  ... 
doi:10.32628/cseit206210 fatcat:463k5b47q5h6fkq4p6abeqx2oa

A MapReduce Cortical Algorithms Implementation for Unsupervised Learning of Big Data

Nadine Hajj, Yara Rizk, Mariette Awad
2015 Procedia Computer Science  
In this paper, we present a distributed cortical algorithm implementation for the unsupervised learning of big data based on a combined node-data parallelization scheme.  ...  A data sparsity measure is used to divide the data before distributing the columns in the network over many computing nodes based on the MapReduce framework.  ...  Next, a survey of neural network (NN) architectures for big data learning is presented in Section 2. Section 3 details the proposed clustering and distributed implementation.  ... 
doi:10.1016/j.procs.2015.07.310 fatcat:yhxhg2a5pneyxbkbhnrf3uc77u

Machine-Learning Based Memory Prediction Model for Data Parallel Workloads in Apache Spark

Rohyoung Myung, Sukyong Choi
2021 Symmetry  
Additionally, the whole building time for the proposed model requires a maximum of 44% of the total execution time of a data-parallel workload.  ...  In this paper, given the type of workload and volume of the input data, we analyze the memory usage pattern and derive the efficient memory size of data-parallel workloads in Apache Spark.  ...  The authors of [20] used a deep neural network to efficiently use computational resources (CPU) in a cloud environment.  ... 
doi:10.3390/sym13040697 fatcat:wg75rx55jjhyzef3pnqhxqlyfa

EGRNN++ and PNN++ : Parallel and Distributed Neural Networks for Big Data Regression and Classification

Sk Kamaruddin, Vadlamani Ravi
2021 SN Computer Science  
Probabilistic Neural Network (PNN) and General Regression Neural Network (GRNN) are unique as they are trained in one pass and perform well for classification and regression problems, respectively; however  ...  Therefore, this paper proposes hybrid architectures for PNN and GRNN, where the pattern layer is made simpler by storing cluster centers of all the samples, thereby making them amenable for big data analytics  ...  K-Means for prediction of data in a big data paradigm.  ... 
doi:10.1007/s42979-021-00504-z fatcat:qzcdd2qxurflpcu26sxts7vvzq

An effective classification approach for big data with parallel generalized Hebbian algorithm

Ahmed Hussein Ali, Royida A. Ibrahem Alhayali, Mostafa Abdulghafoor Mohammed, Tole Sutikno
2021 Bulletin of Electrical Engineering and Informatics  
This paper presents an efficient classification and reduction technique for big data based on parallel generalized Hebbian algorithm (GHA) which is one of the commonly used principal component analysis  ...  (PCA) neural network (NN) learning algorithms.  ...  Implementation on PCA neural network (NN) is another alternative for PCA implementation [55] .  ... 
doi:10.11591/eei.v10i6.3135 fatcat:27pqi7rhrjdybgrgzm7oaginbe

Data Mining, Machine Learning and Big Data Analytics

Lidong Wang
2017 International Transaction of Electrical and Computer Engineers System  
data, IT challenges, and Big Data in an extended service infrastructure.  ...  The feasibility and challenges of the applications of deep learning and traditional data mining and machine learning methods in Big Data analytics are also analyzed and presented.  ...  Neural Networks: Neural networks, also called artificial neural networks, are models for classification and prediction [17]. Neural network algorithms are inherently parallel.  ... 
doi:10.12691/iteces-4-2-2 fatcat:bk3lvlmikjdqhfejqrrxjdq5eq