Fast and Effective Root cause Analysis of Streaming Data using In-Memory Processing Techniques
2017
Indian Journal of Science and Technology
Methods/Analysis: In order to identify the best processing architecture for root-cause analysis, the existing architectures are divided in terms of sequential processing using Python, CPU-based parallelization ...
Pre-processing the input text was identified to be the most processing-intensive component of any text-based processing framework. ...
Analysis (RCA) from text is chosen as the field of operation, as it requires faster processing of huge amounts of data. ...
doi:10.17485/ijst/2017/v10i38/114003
fatcat:5nvppkn4q5bgzg2royrjfno2xu
[HUGE]: Universal Architecture for Statistically Based HUman GEsturing
[chapter]
2006
Lecture Notes in Computer Science
We introduce a universal architecture for statistically based HUman GEsturing (HUGE) system, for producing and using statistical models for facial gestures based on any kind of inducement. ...
As inducement we consider any kind of signal that occurs in parallel to the production of gestures in human behaviour and that may have a statistical correlation with the occurrence of gestures, e.g. text ...
In this section we explained the architecture of our proposed HUGE system. Processes, data and data flows between processes of two system phases are explained in detail. ...
doi:10.1007/11821830_21
fatcat:pltudbkb7zgepiwvyn6pf5plky
REAL-TIME ROOT CAUSE IDENTIFICATION ON STREAMING HETEROGENEOUS DATA USING SPARK
2016
International Journal of Research in Engineering and Technology
This paper presents a parallelized root cause identification architecture that identifies a ranked list of root causes for the user's query. ...
Root cause identification can provide huge breakthroughs in the process of business decision making. ...
Data parallelism is the current requirement, as a huge amount of data is involved in the process and the next-level process can be completed only after completing the current phase. ...
doi:10.15623/ijret.2016.0512005
fatcat:6evu6ymu5bd27h3djh43pwogsm
A Methodological Survey on MapReduce for Identification of Duplicate Images
2016
International Journal of Science and Research (IJSR)
MapReduce is a simple parallel computing technique normally used for analyzing huge data sets. ...
Duplicate image identification for deduplication is a specialized data compression technique for eliminating duplicate copies of repeating data in storage. ...
Figure 1: Parallel stream processing architecture overview. In parallel partitioning systems, generally the typical split and then merge pattern is used. ...
doi:10.21275/v5i1.nov152632
fatcat:wifwrnts3jf4dpkmdcnsdpavtq
Big Data Processing Using Hadoop in Retail Domain
2016
International Journal Of Engineering And Computer Science
So as to deal with the tremendous volume of information, the proposed strategy will prepare the information in parallel as little pieces in dispersed bunches what's more, total every one of the information ...
The proposed strategy is upgraded by utilizing the methods, for example, supposition investigation through regular dialect handling for parsing the information into tokens and emoticon based bunching. ...
this paper we use the MapReduce framework, which provides a parallel processing model and associated implementation to process huge amounts of data. ...
doi:10.18535/ijecs/v5i9.65
fatcat:ou7eyuodtfcx5hm4i2dfmsqf3m
Makespan Map Reduce Architecture for Efficient Memory Utilization
2019
International journal of recent technology and engineering
A massive amount of data that is spread across a large number of machines needs to be parallelized ...
The various challenges in computing on datasets are handling large datasets efficiently and providing large numbers of datasets with ease. The comprehensive method is to enhance data analysis techniques. ...
INTRODUCTION: Cloud computing provides various computing efforts for both scientific and large intensive applications. It provides a distributed architecture to process large amounts of data. ...
doi:10.35940/ijrte.c4729.098319
fatcat:54iseuzsdnavlguwt2rd7n6ffy
A GPU-Based Accelerator for Chinese Word Segmentation
[chapter]
2012
Lecture Notes in Computer Science
Due to the dramatic increase in the amount of Chinese literature in recent years, it has become a big challenge for web search engines to analyze massive Chinese information in time. ...
In this paper, we investigate a new approach to high-performance Chinese information processing. We propose a CPU-GPU collaboration model for Chinese word segmentation. ...
We mainly evaluate the performance of this architecture in two aspects. ...
doi:10.1007/978-3-642-29253-8_20
fatcat:vkuognsnyrd75bn7s3soudxcve
Parallel Lexical Analysis of Multiple Files on Multi-Core Machines
2014
International Journal of Computer Applications
Parallel Compilation is one of the areas that still needs serious research work to fully exploit the inherent power of the architecture. ...
Multi-core machines open new doors to achieve parallelism in a single machine. This new architecture has influenced every field of computing. ...
In this paper we present an approach to parallelize the lexical analysis phase for a huge number of files. ...
doi:10.5120/16879-6879
fatcat:dajilu32pfgljhv6f3ltkcbipe
A Comprehensive Review of Tools & Techniques for Big Data Analytics
2019
International Journal of Emerging Trends in Engineering Research
The changing technology prospect is described by the phrase Big Data that resulted in a large amount of data, a greater variety of data sources, a continuous flow of data and multiple data formats. ...
Then we provide the systematic structure that divides the Big Data system into five sections namely data generation, acquisition and storage, data processing, data querying, and data analytics. ...
COMPARISON OF VARIOUS PROGRAMMING MODELS (columns: Name, Advantages, Disadvantages). MapReduce: parallel in nature; works very fast with both structured and unstructured data; requires a minimal amount of memory; ...
doi:10.30534/ijeter/2019/257112019
fatcat:ntnjikgimfbe7b6u6vln3cxjvq
High Throughput Data-Compression for Cloud Storage
[chapter]
2010
Data Management in Grid and Peer-to-Peer Systems
Introduction As the rate, scale and variety of data increases in complexity, the need for flexible applications that can crunch huge amounts of heterogeneous data (such as web pages, online transaction ...
As data volumes processed by large-scale distributed data-intensive applications grow at high speed, an increasing I/O pressure is put on the underlying storage service, which is responsible for data management ...
action, INRIA, CNRS and RENATER and other contributing partners (see http://www.grid5000.fr/ for details). ...
doi:10.1007/978-3-642-15108-8_1
dblp:conf/globe/Nicolae10
fatcat:efastmye55g67hygdvxv4b64au
Research on Computing Efficiency of MapReduce in Big Data Environment
2019
ITM Web of Conferences
On the basis of mastering the principle and framework of MapReduce programming, the time consumption of distributed computing framework MapReduce and traditional computing model is compared with concrete ...
Based on the big data, this paper deeply studies the principle and framework of MapReduce programming. ...
Acknowledgement The authors would like to thank the anonymous reviewers and the editors for their suggestions. ...
doi:10.1051/itmconf/20192603002
fatcat:z5rufyf2mnaeharfhbnv5t2iem
Bigdata Analysis: Streaming Twitter Data with Apache Hadoop and Visualizing using BigInsights
2015
International Journal of Engineering Research and
The amount of data in industries has been increasing and exploding at high rates, the so-called big data. The use of big data will become a key basis of competition and growth for individual firms. ...
So the simple and easy way to solve the problem of big data is with Hadoop, which processes big data in parallel, running data-intensive jobs on clusters of commodity servers. ...
MapReduce Paradigm: Apache Hadoop's highest processing capabilities are based on MapReduce, a framework for performing highly parallelized processing of huge datasets using large clusters of nodes ...
doi:10.17577/ijertv4is050643
fatcat:itxrgxnow5e33pdmw35hc55wti
Assamese to English Statistical Machine Translation Integrated with a Transliteration Module
2014
International Journal of Computer Applications
It just requires parallel texts that are used in training the system [7]. ...
A Transliteration model is also integrated into the system to deal with out-of-vocabulary (OOV) words. ... train translation models for any language pair. ...
Processing, Gauhati University for their immense support. ...
doi:10.5120/17522-8084
fatcat:i3ztdtkit5hgtaznptpnrmcciy
IMPROVING THE DATA TRANSMISSION SPEED IN CLOUD MIGRATION BY USING MAPREDUCE FOR BIGDATA
2020
International Journal of Engineering Technology and Management Sciences
At each cloud center a huge amount of data is stored, which in turn makes it hard to store and retrieve information from it. ...
This paper explores MapReduce within the distributed cloud architecture where MapReduce assists at each cloud. It strengthens the data migration process with the help of HDFS. ...
it should be a secure, efficient, Today all organizations generate huge amount of data that will be transferred from to another cloud center and it could store by which end users can extract it for And ...
doi:10.46647/ijetms.2020.v04i05.013
fatcat:u4hrvoe7brfalc7k7mpjrcsswi
Samsung and University of Edinburgh's System for the IWSLT 2019
2019
Zenodo
Our submission was ultimately produced by combining four Transformer systems through a mixture of ensembling and reranking. ...
This paper describes the joint submission to the IWSLT 2019 English to Czech task by Samsung R&D Institute, Poland, and the University of Edinburgh. ...
Our final system is an ensemble of large transformer models trained with large amounts of filtered parallel data and selected synthetic data. ...
doi:10.5281/zenodo.3525536
fatcat:srmw7plqmnen5ajt2cla4f2al4