6,487 Hits in 3.0 sec

Tuple MapReduce: Beyond Classic MapReduce

Pedro Ferrera, Ivan de Prado, Eric Palacios, Jose Luis Fernandez-Marquez, Giovanna Di Marzo Serugendo
2012 IEEE 12th International Conference on Data Mining  
This paper proposes Tuple MapReduce, a new foundational model extending MapReduce with the notion of tuples.  ...  Pangool eases the design and implementation of applications based on MapReduce and increases their flexibility, while still maintaining Hadoop's performance.  ...  Hadoop is a programming model and software framework that allows data to be processed following MapReduce concepts. Many abstractions and tools have arisen on top of MapReduce.  ... 
doi:10.1109/icdm.2012.141 dblp:conf/icdm/FerreraPPFS12 fatcat:r2kacp2g2fdjrpixastngetjiu
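The tuple-centric model this entry describes can be pictured with a short sketch: records carry named fields, and the framework groups them by a chosen subset of those fields rather than by a single opaque key. The sketch below is illustrative only and does not reproduce Pangool's actual API; all class and method names are made up.

```java
// Hypothetical sketch (not the Pangool API): records are tuples with named
// fields, and the job declares which fields to group by, instead of packing
// everything into a single key/value pair.
import java.util.*;

public class TupleMapReduceSketch {
    // A tuple is just an ordered map from field name to value.
    record Tuple(Map<String, Object> fields) {
        Object get(String name) { return fields.get(name); }
    }

    // Group tuples by a subset of their fields (what a "tuple shuffle" would
    // do in a real framework); a reducer would then process each group.
    static Map<List<Object>, List<Tuple>> groupBy(List<Tuple> tuples, List<String> groupFields) {
        Map<List<Object>, List<Tuple>> groups = new HashMap<>();
        for (Tuple t : tuples) {
            List<Object> key = new ArrayList<>();
            for (String f : groupFields) key.add(t.get(f));
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(t);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Tuple> input = List.of(
            new Tuple(Map.of("user", "alice", "url", "/a", "time", 10L)),
            new Tuple(Map.of("user", "alice", "url", "/b", "time", 12L)),
            new Tuple(Map.of("user", "bob",   "url", "/a", "time", 11L)));
        // Group by "user"; all of a user's tuples arrive at the same reducer.
        groupBy(input, List.of("user"))
            .forEach((k, v) -> System.out.println(k + " -> " + v.size() + " tuples"));
    }
}
```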

SQL/MapReduce

Eric Friedman, Peter Pawlowski, John Cieslewicz
2009 Proceedings of the VLDB Endowment  
We present a new approach to implementing a UDF, which we call SQL/MapReduce (SQL/MR), that overcomes many of these limitations.  ...  We leverage ideas from the MapReduce programming paradigm to provide users with a straightforward API through which they can implement a UDF in the language of their choice.  ...  Acknowledgements We are thankful to the engineering team at Aster Data Systems without whom SQL/MR and this paper would not have been possible.  ... 
doi:10.14778/1687553.1687567 fatcat:wu3j7qybpvfzvpxctngumpkfsi
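The SQL/MR idea of letting users implement a table function in an ordinary programming language and invoke it from SQL can be sketched roughly as follows. The interface and class names below are hypothetical illustrations, not Aster Data's actual API.

```java
// Minimal sketch of a SQL/MR-style per-row table function: the user writes a
// row operator in Java; the database would invoke it inside a SQL query.
// All names here are illustrative.
import java.util.List;
import java.util.function.Consumer;

interface RowFunction {
    // Called once per input row; may emit zero or more output rows.
    void operateOnRow(List<Object> inputRow, Consumer<List<Object>> emit);
}

// Example: split a text column into one output row per token.
class Tokenize implements RowFunction {
    @Override
    public void operateOnRow(List<Object> inputRow, Consumer<List<Object>> emit) {
        String text = (String) inputRow.get(0);
        for (String token : text.split("\\s+")) {
            emit.accept(List.of((Object) token));
        }
    }
}

public class SqlMrSketch {
    public static void main(String[] args) {
        // Stand-in for the engine applying the function to one row.
        new Tokenize().operateOnRow(List.<Object>of("hello mapreduce world"),
                row -> System.out.println(row));
    }
}
```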

Tiled-MapReduce

Rong Chen, Haibo Chen
2013 ACM Transactions on Architecture and Code Optimization (TACO)  
This article argues that it is more efficient for MapReduce to iteratively process small chunks of data in turn than to process a large chunk of data at once on shared-memory multicore platforms.  ...  TMR partitions a large MapReduce job into a number of small subjobs and iteratively processes one subjob at a time with efficient use of resources; TMR finally merges the results of all subjobs for output.  ...  An analysis is given showing that iteratively processing small chunks of data is more efficient than processing a large chunk of data for MapReduce on multicore platforms.  ... 
doi:10.1145/2445572.2445575 fatcat:fbfbnro6rzegfb4y5vro3i4zva
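The iterative, chunked processing described in this entry can be mimicked in memory with a few lines: split the input into small tiles, run an ordinary map-plus-local-reduce on each tile, and merge the partial results at the end. A minimal sketch, assuming a word-count workload; none of the names below come from the TMR code base.

```java
// In-memory illustration of the tiled idea: process small "tiles" as subjobs
// and merge their partial results, instead of map-reducing everything at once.
import java.util.*;

public class TiledWordCount {
    // One subjob: an ordinary map + local reduce over a small chunk.
    static Map<String, Long> processTile(List<String> lines) {
        Map<String, Long> counts = new HashMap<>();
        for (String line : lines)
            for (String w : line.split("\\s+"))
                counts.merge(w, 1L, Long::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a b a", "b c", "a c c", "b b a");
        int tileSize = 2; // small tiles keep the working set cache-resident
        Map<String, Long> merged = new HashMap<>();
        for (int i = 0; i < input.size(); i += tileSize) {
            List<String> tile = input.subList(i, Math.min(i + tileSize, input.size()));
            // Final merge phase: combine each subjob's partial result.
            processTile(tile).forEach((w, c) -> merged.merge(w, c, Long::sum));
        }
        System.out.println(merged); // {a=4, b=4, c=3}
    }
}
```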

MapReduce in GPI-Space [chapter]

Tiberiu Rotaru, Mirko Rahn, Franz-Josef Pfreundt
2014 Lecture Notes in Computer Science  
On the other hand, the growing demand for processing big data volumes requires better control of workflows, efficient storage management, and a fault-tolerant runtime system.  ...  To offer our own solution to these problems, we designed and developed GPI-Space, a complex but flexible software development and execution platform, in which the data coordination of an application  ...  Compared to this, GPI-Space is more flexible, offering tools that facilitate the development of more complex workflows than MapReduce.  ... 
doi:10.1007/978-3-642-54420-0_5 fatcat:w6tnucgvnnbdxbufi3r7cmsryu

SNP genotype calling with MapReduce

Simone Leo, Luca Pireddu, Gianluigi Zanetti
2012 Proceedings of the third international workshop on MapReduce and its Applications - MapReduce '12  
Here, we present a scalable MapReduce application that offers greater scalability and flexibility than the current state of the art.  ...  The software can process datasets as large as 7000 samples in a day; it is more than one order of magnitude faster than previous solutions, and it is currently used in production.  ...  ACKNOWLEDGMENTS We would like to thank Francesco Cucca and CNR-IRGB for kindly allowing us to perform scalability tests with their genotyping data.  ... 
doi:10.1145/2287016.2287026 fatcat:uzw2wpor7bdbtkzg6ew5ks5ngq

Document Selection Using Mapreduce

Yenumula B Reddy, Desmond Hill
2015 International Journal of Security Privacy and Trust Management  
The paper includes an analysis of Big Data using MapReduce techniques and the identification of a required document from a stream of documents.  ...  The research discusses MapReduce issues, a framework for the MapReduce programming model, and its implementation.  ...  Structured and unstructured data come from a variety of sources. The adoption of big data tools to process such data is increasing.  ... 
doi:10.5121/ijsptm.2015.4401 fatcat:dbo5nxkgtjelxdkqc2cjjkuwia

Building cubes with MapReduce

Alberto Abelló, Jaume Ferrarons, Oscar Romero
2011 Proceedings of the ACM 14th international workshop on Data Warehousing and OLAP - DOLAP '11  
Indeed, specific software tools for exploiting a cloud are also available. The trend in this case is toward tools based on the MapReduce paradigm developed by Google.  ...  In this paper, we explore the possibility of keeping data in a cloud by using BigTable to store the corporate historical data and MapReduce as an agile mechanism to deploy cubes in ad-hoc Data Marts.  ...  A MapReduce tool would help parallelize the ETL process and allow a different cleaning solution to be chosen depending on the specific needs at the time.  ... 
doi:10.1145/2064676.2064680 dblp:conf/dolap/AbelloFR11 fatcat:cbjy5cdihffwjfpctnkjxi7xeq
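As a rough illustration of deploying a cube with MapReduce, the map phase can emit the dimension values of each fact as the key and the measure as the value, and the reduce phase sums the measure per dimension combination. The sketch below simulates this in memory with made-up field names; it is not code from the paper.

```java
// Materializing one cuboid: "map" keys each fact by its dimension values,
// "reduce" sums the measure per key. Data and field names are illustrative.
import java.util.*;

public class CubeSketch {
    record Sale(String region, String product, double amount) {}

    public static void main(String[] args) {
        List<Sale> facts = List.of(
            new Sale("EU", "bike", 100), new Sale("EU", "bike", 50),
            new Sale("US", "bike", 70),  new Sale("EU", "car", 900));
        // Key = (region, product); value = sum of amount per key.
        Map<List<String>, Double> cuboid = new HashMap<>();
        for (Sale s : facts)
            cuboid.merge(List.of(s.region(), s.product()), s.amount(), Double::sum);
        cuboid.forEach((dims, total) -> System.out.println(dims + " -> " + total));
        // Coarser cuboids (e.g. per region only) would be further reduce passes.
    }
}
```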

Otus

Kai Ren, Julio López, Garth Gibson
2011 Proceedings of the second international workshop on MapReduce and its applications - MapReduce '11  
This approach is embodied in Otus, a monitoring tool that attributes resource usage to jobs and services in Hadoop clusters.  ...  Frameworks for large-scale data-intensive applications, such as Hadoop and Dryad, have gained tremendous popularity.  ...  MapReduce task processes only run for a limited amount of time, and each MapReduce job has a different number of task processes.  ... 
doi:10.1145/1996092.1996094 fatcat:44enp3wp4za2vhegqlevv2cc4u

Versatile XQuery Processing in MapReduce [chapter]

Caetano Sauer, Sebastian Bächle, Theo Härder
2013 Lecture Notes in Computer Science  
The MapReduce (MR) framework has become a standard tool for performing large batch computations, usually of an aggregative nature, in parallel over a cluster of commodity machines.  ...  XQuery not only is an established query language, but also has a more expressive data model and more powerful language constructs, enabling a much greater degree of flexibility.  ...  BrackitMR is also a flexible tool for large-scale query processing, because it builds upon an existing single-core query processor, Brackit, adding MR as a distributed coordination layer.  ... 
doi:10.1007/978-3-642-40683-6_16 fatcat:rjlndrsjmfgfvm32wlyliejlca

Parallel data processing with MapReduce

Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, Bongki Moon
2012 SIGMOD record  
A prominent parallel data processing tool, MapReduce, is gaining significant momentum from both industry and academia as the volume of data to analyze grows rapidly.  ...  We also discuss the open issues and challenges raised in parallel data analysis with MapReduce.  ...  MapReduce, which has been popularized by Google, is a scalable and fault-tolerant data processing tool that enables the processing of a massive volume of data in parallel with many low-end computing nodes [44]  ... 
doi:10.1145/2094114.2094118 fatcat:kuvfuwss3fcmbf2d7oqqfibmoq
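The programming model surveyed in this entry is usually introduced through word count; the sketch below is a compact Hadoop version of that canonical job, with illustrative job setup and input/output paths taken from the command line.

```java
// Canonical word count against Hadoop's mapreduce API: the mapper emits
// (word, 1) pairs, the reducer sums the counts per word.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                word.set(token);
                ctx.write(word, ONE); // emit (word, 1) for every token
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum)); // total occurrences of this word
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```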

MapReduce Research on Warehousing of Big Data

2018 International Journal of Recent Trends in Engineering and Research  
This paper provides an overview of research attempting to incorporate MapReduce with data warehouses in order to equip them for handling big data.  ...  is facing a paradigm shift.  ...  a hybrid data processing engine.  ... 
doi:10.23883/ijrter.2018.4170.raqfm fatcat:jdmegv5vrrdttp5a2dw53rydee

MapReduce and parallel DBMSs

Michael Stonebraker, Daniel Abadi, David J. DeWitt, Sam Madden, Erik Paulson, Andrew Pavlo, Alexander Rasin
2010 Communications of the ACM  
The MapReduce (MR) paradigm [7] has been hailed as a revolutionary new platform for large-scale, massively parallel data access [16]. Some proponents claim the extreme scalability of MR will relegate relational  ...  At least one enterprise, Facebook, has implemented a large data warehouse system using MR technology rather than a DBMS [14]. Here, we argue that using MR systems to perform tasks that are best suited for  ...  Mapping Parallel DBMSs onto MapReduce: An attractive quality of the MR programming model is simplicity; an MR program consists of only two functions, Map and Reduce, written by a user to process key/value  ... 
doi:10.1145/1629175.1629197 fatcat:z5pgtg4iizcrhbv23qt37yri5i
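The "only two functions" observation can be made concrete by writing the signatures out: map turns one input record into intermediate key/value pairs, and reduce folds all values that share a key. The interfaces below are a conceptual sketch, not any particular framework's API.

```java
// The two user-supplied functions of the MR model as plain Java interfaces,
// making the key/value signatures explicit.
import java.util.List;
import java.util.Map;

interface MapFn<K1, V1, K2, V2> {
    // One input record in, zero or more intermediate key/value pairs out.
    List<Map.Entry<K2, V2>> map(K1 key, V1 value);
}

interface ReduceFn<K2, V2, V3> {
    // All intermediate values sharing a key in, zero or more outputs out.
    List<V3> reduce(K2 key, Iterable<V2> values);
}
```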

A hybrid mapreduce model for prolog

Joana Corte-Real, Ines Dutra, Ricardo Rocha
2014 International Symposium on Integrated Circuits (ISIC)  
State-of-the-art systems such as Google's, Hadoop, or SAGA often provide added features like a distributed file system, fault-tolerance mechanisms, data redundancy, and portability to the basic MapReduce  ...  MapReduce for Prolog addresses efficiency issues by performing load balancing on data with different granularity and allowing for parallelization in shared memory, as well as across machines.  ...  As such, a hybrid MapReduce construct would be a valuable tool to make this process simpler and more efficient.  ... 
doi:10.1109/isicir.2014.7029555 dblp:conf/isicir/Corte-RealDR14 fatcat:en6y73q445ahdgiv77gbhecray

Static type checking of Hadoop MapReduce programs

Jens Dörre, Sven Apel, Christian Lengauer
2011 Proceedings of the second international workshop on MapReduce and its applications - MapReduce '11  
MapReduce is a programming model for the development of Web-scale programs.  ...  For example, in Hadoop, the connection between the two phases of a MapReduce computation is unsafe: there is no static type check of the generic type parameters involved.  ...  Technically, a MapReduce system is a framework for processing data in chunks.  ... 
doi:10.1145/1996092.1996096 fatcat:5m5wiq2a2bgvfmapvnnrcb2oce
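The unsafe connection this entry points out is easy to reproduce: Hadoop's Job configuration accepts mapper and reducer classes as untyped Class objects, so the compiler never relates their generic type parameters. In the sketch below (illustrative class names, no real data flow), the mapper's output value type and the reducer's input value type disagree, yet the code compiles; Hadoop reports the mismatch only at run time.

```java
// The mapper emits IntWritable values, the reducer expects LongWritable, and
// the job configuration still compiles, because setMapperClass/setReducerClass
// take untyped Class objects.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class TypeMismatch {
    // Declared to emit (Text, IntWritable) pairs.
    static class CountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {}

    // Declared to consume (Text, LongWritable) pairs -- incompatible with the mapper.
    static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {}

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "type mismatch");
        job.setMapperClass(CountMapper.class);   // no compile-time link between
        job.setReducerClass(SumReducer.class);   // these two generic signatures
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputValueClass(LongWritable.class);
        // javac accepts all of this; the mismatch surfaces only when the job runs.
    }
}
```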

Assessing MapReduce for Internet Computing: A Comparison of Hadoop and BitDew-MapReduce

Lu Lu, Hai Jin, Xuanhua Shi, Gilles Fedak
2012 ACM/IEEE 13th International Conference on Grid Computing  
MapReduce is emerging as an important programming model for data-intensive applications.  ...  Adapting this model to desktop grids would allow taking advantage of the vast amount of computing power and distributed storage to execute a new range of applications able to process enormous amounts of data  ...  They can use the BitDew API (or the command tool) to upload input data to workers and the MapReduce API to build their applications.  ... 
doi:10.1109/grid.2012.31 dblp:conf/grid/LuJSF12 fatcat:sq3kgjnsfned5hsmrgo3ips7za
Showing results 1 — 15 out of 6,487 results