
Big Data Analysis with R Programming and RHadoop

U. Prathibha, M. Thillainayaki, A. Jenneth
2018 International Journal of Trend in Scientific Research and Development  
The paper focuses on extraction of data efficiently in big data tools using R programming techniques and how to manage the data and the components that are useful in handling big data.  ...  This paper proposes the big data applications with the Hadoop Distributed Framework for storing huge data in cloud in a highly efficient manner.  ...  BIG DATA AND HADOOP Big Data is a term used for a collection of data sets so large and complex that it is difficult to process using traditional applications/tools.  ... 
doi:10.31142/ijtsrd15705 fatcat:cx2zz6e7undc5dsrsvenmil774

Integrating R and Hadoop for Big Data Analysis [article]

Bogdan Oancea, Raluca Mariana Dragoescu
2014 arXiv   pre-print
One of the software tools successfully and wide spread used for storage and processing of big data sets on clusters of commodity hardware is Hadoop.  ...  Hadoop framework contains libraries, a distributed file-system (HDFS), a resource-management platform and implements a version of the MapReduce programming model for large scale data processing.  ...  In this paper we presented three ways of integrating R and Hadoop for processing large scale data sets: R and Streaming, Rhipe and RHadoop.  ... 
arXiv:1407.4908v1 fatcat:wbfuxpyzprg5lfjkeaipbfhnkm

Big Challenges? Big Data …

Sahil R., Aarati Mahajan
2015 International Journal of Computer Applications  
This document provides insights on the challenges of managing such a huge Data popularly known as Big Data, the solutions offered by Big Data management tools/techniques and the opportunities it has created  ...  This data generated through large customer transactions, social networking sites is varied, voluminous and rapidly generating. All this data prove a storage and processing crisis for the enterprises.  ...  TRADITIONAL DATA ANALYTICS V/S BIG DATA ANALYTICS In Traditional Analytics, the analysis used to be done on the known data topography which was well understood.  ... 
doi:10.5120/ijca2015907452 fatcat:dgmzyowjyncvjjkepcdc4i4yey

RPig: Concise Programming Framework by Integrating R with Pig for Big Data Analytics [chapter]

MingXue Wang, Sidath Handurukande
2015 Cloud Computing with e-Science Applications  
To enable R to directly read/write data in these large scale data warehouses, interfaces between these warehouses and R are developed, such as Ricardo[18] that offers a bridge between R and Hadoop HDFS  ...  Background Big data [5] is data in volumes so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.  ... 
doi:10.1201/b18021-10 fatcat:inglp7sprnaqzdvroy5gdobjqa

A Review: Big Data Technologies with Hadoop Distributed Filesystem and Implementing M/R

Renas Rajab Asaad, Hawar B. Ahmad, Rasan Ismael Ali
2020 Academic Journal of Nawroz University  
Moreover, this paper focuses on discussing and understanding Big Data technologies and Analytics system with Hadoop distributed filesystem (HDFS).  ...  Today Big Data, is any set of data that is larger than the capacity to be processed using traditional database tools to capture, share, transfer, store, manage and analyze within an acceptable time frame  ...  Through "push" -a modern form of patriarchy -and on a large scale, governments are trying to direct citizens towards more harmonious behavior with the environment.  ... 
doi:10.25007/ajnu.v9n1a530 fatcat:75upvcitm5e4vm5knd4x52eaby

A REVIEW STUDY ON BIG DATA ANALYSIS USING R STUDIO

Savita, Neeraj Verma
2020 International Journal of Engineering Technologies and Management Research  
This paper primarily focuses on discussing the various technologies that work together as a Big Data Analytics system that can help predict future volumes, gain insights, take proactive actions, and give  ...  Big Data Analytics is a way of extracting value from these huge volumes of information, and it drives new market opportunities and maximizes customer retention.  ...  [9] This paper discusses some of the most commonly used big data technologies mostly open source that work together as a big data analytics system for leveraging large quantities of unstructured data  ... 
doi:10.29121/ijetmr.v6.i6.2019.402 fatcat:iykqbf6rdfanbprlo7avnfanfe

A Novel Big Data Approach to Classify Bank Customers - Solution by Combining PIG, R and Hadoop

Lija Mohan, Sudheep Elayidom M.
2016 International Journal of Information Technology and Computer Science  
Instead of relying on a single technology to process large scale data, we make use of a combination of strategies like Hadoop, PIG, R etc for efficient analysis.  ...  Extracting hidden patterns, customer preferences, market trends, unknown correlations, or any other useful business information from large collection of structured or unstructured data set is called Big  ...  Map Reduce [33] programming is followed in Hadoop. Hadoop is widely used in applications where large scale data processing is necessary.  ... 
doi:10.5815/ijitcs.2016.09.10 fatcat:tucdibuabvf5bbhspqna6jlizq

Hands-on Big Data

Ryan Womack
2020 Zenodo  
The workshop will provide an overview of key technologies for the handling and analysis of large scale datasets, including Hadoop/MapReduce, the RHadoop package, other R packages used for large scale analysis  ...  Participants will work with a live demonstration environment that provides a realistic introduction to Big Data Analytics using scripts that will run both on a scaled-down demonstration dataset and on  ...  HBase is a large-scale data store modeled on Google's BigTable for storing sparse data. MongoDB is a NoSQL database designed for large-scale operation.  ... 
doi:10.5281/zenodo.3776988 fatcat:pdnu3ngynrfmncvk7big3qbyoi

Big Data [chapter]

Roberto Zicari
2013 Big Data Computing  
Akmal Chaudhri, Tom Fastner, Laura Haas, Alon Halevy, Volker Markl, Dave Thomas, Duncan Ross, Cindy Saracco, Justin Sheehy, Miguel-Angel Sicilia, Mike OSullivan, Steve Vinoski, for their feedback on  ...  by re-implementing Hadoop components. 43 Big Data "Dichotomy" • Analytics: MapReduce, Hadoop • Developers of very large scale user-facing Web sites implemented key-value stores -Google Big Table  ...  with multiple customers using Hadoop and many vendors innovating on top of Hadoop.  ... 
doi:10.1201/b16014-5 fatcat:dmmykohmnbhdfftho4sxl2uf2m

Big Data Analytics = Machine Learning + Cloud Computing [chapter]

C. Wu, R. Buyya, K. Ramamohanarao
2016 Big Data  
The objective of Hadoop is to leverage the commodity hardware for large scale of processing workload, which it used to be only possible to be accomplished by some expensive mainframe computers.  ...  That is, the execution of machine learning tasks on large-data sets in cloud computing environments is often called as Big Data analytics.  ...  Summary of Hadoop and its Ecosystems Hadoop has become the standard framework to run distributed BDA that can process massive scale of data on large clusters based on the commodity hardware or a cloud  ... 
doi:10.1016/b978-0-12-805394-2.00001-5 fatcat:2a2avnxwivbztmp7iksxqgkv2a

Big Data Analytics Integrating a Parallel Columnar DBMS and the R Language

Yiqun Zhang, Carlos Ordonez, Wellington Cabrera
2016 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)  
With that motivation in mind, we present COLUMNAR, a system integrating a parallel columnar DBMS and R, that can directly compute models on large data sets stored as relational tables.  ...  On the other hand, it is an order of magnitude faster than Spark (a prominent Hadoop system) and a traditional row-based DBMS.  ...  Big Data Analytics with SQL Queries and R We start by discussing storage in the columnar DBMS, where indexing is not required.  ... 
doi:10.1109/ccgrid.2016.94 dblp:conf/ccgrid/0001OC16 fatcat:qtdk36zro5bjxologekd6u24va

A survey: big data analytics on healthcare system

R. Sathiyavathi
2015 Contemporary Engineering Sciences  
The gigantic size of analytics will need large computation which can be done with the help of distributed processing HADOOP.  ...  Big data is used to predict epidemics, cure disease, improve quality of life and avoid preventable deaths. with the increasing population of the world, and everyone living longer, and many of the decision  ...  Analytic Module With R  ... 
doi:10.12988/ces.2015.412255 fatcat:soef56wfpzabxkwpxgiazuitvi

RABID: A Distributed Parallel R for Large Datasets

Hao Lin, Shuo Yang, Samuel P. Midkiff
2014 2014 IEEE International Congress on Big Data  
This paper describes highly parallel R system called RABID (R Analytics for BIg Data) that maintains R compatibility, leverages the MapReduce-like distributed Spark [22] and achieves high performance and  ...  R [5] is one of the most widely used of these languages, but is limited to a single threaded execution model and problem sizes that fit in a single node.  ...  CONCLUSIONS AND FUTURE WORK RABID provides R users with a familiar programming model that scales to large clusters, allowing larger problem sizes to be efficiently handled.  ... 
doi:10.1109/bigdata.congress.2014.107 dblp:conf/bigdata/LinYM14 fatcat:zncxc5ygkzfp5abmpc5aovoefq

BIG PROSPECTS AND PROBLEMS OF BIG DATA TECHNOLOGY
BİG DATA TEXNOLOGİYALARININ BÖYÜK PERSPEKTİVLƏRİ VƏ PROBLEMLƏRİ

Yadigar Imamverdiyev
2016 Problems of Information Society  
Big Data covers technologies and tools for collecting, processing, analyzing and extracting useful knowledge from structured and unstructured data of large volumes generated at high speed by different  ...  components and analytical capabilities of Big Data, and identifies advantages, prospects and existing problems.  ...  Hive: data storage infrastructure, used to request to large amounts of data located in Hadoop file system via SQL, and it fully supports MapReduce.  ... 
doi:10.25045/jpis.v07.i1.03 fatcat:mcecbsnqfva6tkizpfgs6aiyim

Infrastructure with r package for anomaly detection in real time big log data

Zirije Hasani
2017 Pressacademia  
Also we add the elastic-R client to the infrastructure we develop for big data analytic in order to detect anomalies.  ...  Also we present algorithms that are used for anomaly detection in big data. The algorithms are implemented in R language.  ...  Depending on the input stream of data we will experiment with scale up/out of the system components/servers and including other (batch appropriate) components as R software.  ... 
doi:10.17261/pressacademia.2017.588 fatcat:flpt5fpezza4npzilqrmgx7bxi