42,156 Hits in 5.7 sec

Fast and Effective Root cause Analysis of Streaming Data using In-Memory Processing Techniques

S. Naveen Kumar, S. Vijayaragavan
2017 Indian Journal of Science and Technology  
Methods/Analysis: In order to identify the best processing architecture for root-cause analysis, the existing architectures are divided in terms of sequential processing using Python, CPU based parallelization  ...  Pre-processing the input text was identified to be the most process intensive component of any text based processing framework.  ...  Analysis (RCA) from text is chosen as the field of operation, as it requires faster processing of huge amounts of data.  ... 
doi:10.17485/ijst/2017/v10i38/114003 fatcat:5nvppkn4q5bgzg2royrjfno2xu

[HUGE]: Universal Architecture for Statistically Based HUman GEsturing [chapter]

Karlo Smid, Goranka Zoric, Igor S. Pandzic
2006 Lecture Notes in Computer Science  
We introduce a universal architecture for statistically based HUman GEsturing (HUGE) system, for producing and using statistical models for facial gestures based on any kind of inducement.  ...  As inducement we consider any kind of signal that occurs in parallel to the production of gestures in human behaviour and that may have a statistical correlation with the occurrence of gestures, e.g. text  ...  In this section we explained the architecture of our proposed HUGE system. Processes, data and data flows between processes of two system phases are explained in detail.  ... 
doi:10.1007/11821830_21 fatcat:pltudbkb7zgepiwvyn6pf5plky

REAL-TIME ROOT CAUSE IDENTIFICATION ON STREAMING HETEROGENEOUS DATA USING SPARK

S. Charles Britto
2016 International Journal of Research in Engineering and Technology  
This paper presents a parallelized root cause identification architecture that identifies a ranked list of root causes for the user's query.  ...  Root cause identification can provide huge breakthroughs in the process of business decision making.  ...  Data parallelism is the current requirement, as huge amount of data is involved in the process and the next level process can be completed only after completing the current phase.  ... 
doi:10.15623/ijret.2016.0512005 fatcat:6evu6ymu5bd27h3djh43pwogsm

A Methodological Survey on MapReduce for Identification of Duplicate Images

2016 International Journal of Science and Research (IJSR)  
MapReduce is a simple, parallel computing technique normally used for analyzing huge data.  ...  Duplicate image identification for deduplication is a specialized data compression technique for eliminating duplicate copies of repeating data in storage.  ...  Figure 1: Parallel stream processing architecture overview. In parallel partitioning systems, generally the typical split and then merge pattern is used.  ... 
doi:10.21275/v5i1.nov152632 fatcat:wifwrnts3jf4dpkmdcnsdpavtq

Big Data Processing Using Hadoop in Retail Domain

Goddilla Nagarjuna Reddy
2016 International Journal Of Engineering And Computer Science  
So as to deal with the tremendous volume of information, the proposed strategy will prepare the information in parallel as little pieces in dispersed bunches what's more, total every one of the information  ...  The proposed strategy is upgraded by utilizing the methods, for example, supposition investigation through regular dialect handling for parsing the information into tokens and emoticon based bunching.  ...  this paper we use MapReduce framework, that provides a parallel processing model and associated implementation to process huge amounts of data.  ... 
doi:10.18535/ijecs/v5i9.65 fatcat:ou7eyuodtfcx5hm4i2dfmsqf3m

Makespan Map Reduce Architecture for Efficient Memory Utilization

2019 International journal of recent technology and engineering  
Massive large amount of data which are spread across the large number of machines needs to be parallelized  ...  The various challenges in computing dataset is handling large dataset efficiently and providing large amount of datasets with ease. The comprehensive method is to enhance data analyzation techniques.  ...  INTRODUCTION: Cloud computing provides the various computing efforts for both scientific and large intensive applications. It provides distributed architecture to process large amount of data.  ... 
doi:10.35940/ijrte.c4729.098319 fatcat:54iseuzsdnavlguwt2rd7n6ffy

A GPU-Based Accelerator for Chinese Word Segmentation [chapter]

Xiwu Gu, Ruixuan Li, Kunmei Wen, Bei Peng, Weijun Xiao
2012 Lecture Notes in Computer Science  
Due to the dramatic increase in the amount of Chinese literature in recent years, it becomes a big challenge for web search engines to analyze massive Chinese information in time.  ...  In this paper, we investigate a new approach to high-performance Chinese information processing. We propose a CPU-GPU collaboration model for Chinese word segmentation.  ...  We mainly evaluate the performance of this architecture in two aspects.  ... 
doi:10.1007/978-3-642-29253-8_20 fatcat:vkuognsnyrd75bn7s3soudxcve

Parallel Lexical Analysis of Multiple Files on Multi-Core Machines

Amit Barve, Brijendra Kumar Joshi
2014 International Journal of Computer Applications  
Parallel Compilation is one of the areas that still needs serious research work to fully exploit the inherent power of the architecture.  ...  The multi-core machines open new doors to achieve parallelism in single machine. This new architecture has influenced every field of computing.  ...  In this paper we present an approach to parallelize lexical analysis phase for huge number of files.  ... 
doi:10.5120/16879-6879 fatcat:dajilu32pfgljhv6f3ltkcbipe

A Comprehensive Review of Tools & Techniques for Big Data Analytics

Amita Dhankhar, Maharshi Dayanand University, Rohtak-124001, India
2019 International Journal of Emerging Trends in Engineering Research  
The changing technology prospect is described by the phrase Big Data that resulted in a large amount of data, a greater variety of data sources, a continuous flow of data and multiple data formats.  ...  Then we provide the systematic structure that divides the Big Data system into five sections namely data generation, acquisition and storage, data processing, data querying, and data analytics.  ...  COMPARISON OF VARIOUS PROGRAMMING MODELS (columns: Name, Advantages, Disadvantages). MapReduce: Parallel in nature. Work very fast with both structured and unstructured data. Require minimal amount of memory.  ... 
doi:10.30534/ijeter/2019/257112019 fatcat:ntnjikgimfbe7b6u6vln3cxjvq

High Throughput Data-Compression for Cloud Storage [chapter]

Bogdan Nicolae
2010 Data Management in Grid and Peer-to-Peer Systems  
Introduction: As the rate, scale and variety of data increases in complexity, the need for flexible applications that can crunch huge amounts of heterogeneous data (such as web pages, online transaction  ...  As data volumes processed by large-scale distributed data-intensive applications grow at high-speed, an increasing I/O pressure is put on the underlying storage service, which is responsible for data management  ...  action, INRIA, CNRS and RENATER and other contributing partners (see http://www.grid5000.fr/ for details).  ... 
doi:10.1007/978-3-642-15108-8_1 dblp:conf/globe/Nicolae10 fatcat:efastmye55g67hygdvxv4b64au

Research on Computing Efficiency of MapReduce in Big Data Environment

Tilei Gao, Ming Yang, Rong Jiang, Yu Li, Yao Yao, G. Lee
2019 ITM Web of Conferences  
On the basis of mastering the principle and framework of MapReduce programming, the time consumption of distributed computing framework MapReduce and traditional computing model is compared with concrete  ...  Based on the big data, this paper deeply studies the principle and framework of MapReduce programming.  ...  Acknowledgement The authors would like to thank the anonymous reviewers and the editors for their suggestions.  ... 
doi:10.1051/itmconf/20192603002 fatcat:z5rufyf2mnaeharfhbnv5t2iem

Bigdata Analysis: Streaming Twitter Data with Apache Hadoop and Visualizing using BigInsights

Manoj Kumar Danthala, Dr. Siddhartha Ghosh
2015 International Journal of Engineering Research and  
The amount of data in industries has been increasing and exploding to high rates, so-called big data. The use of big data will become a key basis of competition and growth for individual firms.  ...  So the simple and easy way to solve the problem of big data is with Hadoop which processes the big data in parallel of data intensive jobs on clusters of commodity servers.  ...  MapReduce Paradigm: The Apache Hadoop's highest processing capabilities are based on MapReduce, a framework for performing highly parallelized processing of huge datasets, using large clusters of nodes  ... 
doi:10.17577/ijertv4is050643 fatcat:itxrgxnow5e33pdmw35hc55wti

Assamese to English Statistical Machine Translation Integrated with a Transliteration Module

Pranjal Das, Kalyanee K. Baruah
2014 International Journal of Computer Applications  
It just requires parallel texts that are used in training the system [7].  ...  A Transliteration model is also integrated into the system to deal with out of vocabulary (OOV) words. train translation models for any language pair.  ...  Processing, Gauhati University for their immense support.  ... 
doi:10.5120/17522-8084 fatcat:i3ztdtkit5hgtaznptpnrmcciy

IMPROVING THE DATA TRANSMISSION SPEED IN CLOUD MIGRATION BY USING MAPREDUCE FOR BIGDATA

Naresh P, Rajyalakshmi P, Krishna Vempati, Saidulu D
2020 International Journal of Engineering Technology and Management Sciences  
At each cloud center a huge amount of data is stored, which in turn makes it hard to store and retrieve information from it.  ...  This paper explores MapReduce within the distributed cloud architecture where MapReduce assists at each cloud. It strengthens the data migration process with the help of HDFS.  ...  it should be a secure, efficient, Today all organizations generate huge amount of data that will be transferred from to another cloud center and it could store by which end users can extract it for And  ... 
doi:10.46647/ijetms.2020.v04i05.013 fatcat:u4hrvoe7brfalc7k7mpjrcsswi

Samsung and University of Edinburgh's System for the IWSLT 2019

Joanna Wetesko, Marcin Chochowski, Pawel Przybysz, Philip Williams, Roman Grundkiewicz, Rico Sennrich, Barry Haddow, Antonio Valerio Miceli Barone, Alexandra Birch
2019 Zenodo  
Our submission was ultimately produced by combining four Transformer systems through a mixture of ensembling and reranking.  ...  This paper describes the joint submission to the IWSLT 2019 English to Czech task by Samsung R&D Institute, Poland, and the University of Edinburgh.  ...  Our final system is an ensemble of large transformer models trained with large amounts filtered parallel data and selected synthetic data.  ... 
doi:10.5281/zenodo.3525536 fatcat:srmw7plqmnen5ajt2cla4f2al4