Understanding Vertical Scalability of I/O Virtualization for MapReduce Workloads: Challenges and Opportunities [chapter]

Bogdan Nicolae
2014 Lecture Notes in Computer Science  
This paper aims to answer these questions in the context of I/O intensive MapReduce workloads: it analyzes and characterizes their behavior under different virtualization scenarios in order to propose  ...  One such important challenge relates to the limited scalability of I/O, a determining factor in the overall performance of big data applications.  ...  Fig. 3. Aggregated statistics for overall CPU and disk utilization with a variable number of VMs per node and virtual disks per VM. We experiment with I/O intensive MapReduce workloads in several virtualization  ... 
doi:10.1007/978-3-642-54420-0_1 fatcat:t7isdcjlsvbnlkdn7qio6ta5ma

Reconfigurable Hardware Accelerators: Opportunities, Trends, and Challenges [article]

Chao Wang, Wenqi Lou, Lei Gong, Lihui Jin, Luchao Tan, Yahui Hu, Xi Li, Xuehai Zhou
2017 arXiv   pre-print
In this survey, we compare hot research issues and concern domains, furthermore, analyze and illuminate advantages, disadvantages, and challenges of reconfigurable accelerators.  ...  In the end, we prospect the development tendency of accelerator architectures in the future, hoping to provide a reference for computer architecture researchers.  ...  bring new challenges and opportunities to the computer industry.  ... 
arXiv:1712.04771v1 fatcat:3lxv45qb4zaqpagtn3eghrmroe

Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system

Richard M. Yoo, Anthony Romano, Christos Kozyrakis
2009 2009 IEEE International Symposium on Workload Characterization (IISWC)  
This work optimizes Phoenix, a MapReduce runtime for shared-memory multi-cores and multiprocessors, on a quad-chip, 32-core, 256-thread UltraSPARC T2+ system with NUMA characteristics.  ...  Nevertheless, implementing such runtime systems for large-scale, shared-memory systems can be challenging.  ...  We also thank Woongki Baek, David Lo, and Daniel Sanchez for their input on the initial version of the paper.  ... 
doi:10.1109/iiswc.2009.5306783 dblp:conf/iiswc/YooRK09 fatcat:sc6k7uaux5fovp4w5qrjeuumcy

Cloud computing in e-Science: research challenges and opportunities

Xiaoyu Yang, David Wallom, Simon Waddington, Jianwu Wang, Arif Shaon, Brian Matthews, Michael Wilson, Yike Guo, Li Guo, Jon D. Blower, Athanasios V. Vasilakos, Kecheng Liu (+1 others)
2014 Journal of Supercomputing  
While the emergence of Cloud computing as a new computing paradigm has provided new directions and opportunities for e-Science infrastructure development, it also presents some challenges.  ...  Our particular contributions include identifying associated research challenges and opportunities, presenting lessons learned, and describing our future vision for applying Cloud computing to e-Science  ...  Acknowledgements We thank the anonymous reviewers for their constructive and insightful suggestions.  ... 
doi:10.1007/s11227-014-1251-5 fatcat:ribdemcam5coxmwoloszbc54ne

Workload interleaving with performance guarantees in data centers

Feng Yan, Evgenia Smirni
2016 NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium  
In the era of global, large scale data centers residing in clouds, many applications and users share the same pool of resources for the purposes of reducing energy and operating costs, and of improving  ...  Finally, at the computing cluster level, we investigate popular computing frameworks for large-scale data intensive distributed processing, such as MapReduce and its Hadoop implementation.  ...  Each trace records information about a set of attributes for each I/O request.  ... 
doi:10.1109/noms.2016.7502934 dblp:conf/noms/YanS16 fatcat:4aboolltgvb5xntubowkrkfsvu

Reconstructing Hardware Transactional Memory for Workload Optimized Systems [chapter]

Kunal Korgaonkar, Prabhat Jain, Deepak Tomar, Kashyap Garimella, Veezhinathan Kamakoti
2011 Lecture Notes in Computer Science  
This creates grand challenges to architectural and system designs, as well as to methods of programming these systems, which form the core theme of APPT 2011.  ...  As an event that has taken place for 16 years, APPT aims at providing a high-quality program for all attendees. We accepted 13 papers out of 40 submissions, presenting an acceptance rate of 32.5%.  ...  By our study and analysis of X10 and its runtime, we eliminated some performance bottlenecks such as I/O processing. We also identified some remaining optimization opportunities.  ... 
doi:10.1007/978-3-642-24151-2_1 fatcat:32cx745cn5cfdm5sbeah6eyiey

Adaptive workload allocation in query processing in autonomous heterogeneous environments

Anastasios Gounaris, Jim Smith, Norman W. Paton, Rizos Sakellariou, Alvaro A. A. Fernandes, Paul Watson
2008 Distributed and parallel databases  
The increasing prevalence of networked storage and computational resources, along with middleware for managing resource access and sharing, raises the prospect that queries can be run over resources obtained  ...  To address this challenge, adap-  ...  Communicated by Ahmed K. Elmagarmid.  ...  This work was conducted while the first author was with the University of Manchester, UK.  ... 
doi:10.1007/s10619-008-7032-5 fatcat:4nmsslhw7ngk5k6iy4jobb3mxm

Data-Intensive Cloud Computing: Requirements, Expectations, Challenges, and Solutions

Jawwad Shamsi, Muhammad Ali Khojaye, Mohammad Ali Qasmi
2013 Journal of Grid Computing  
However, underlying this abstraction, there are stringent requirements and challenges to facilitate scalable and resourceful services through effective physical infrastructure, smart networking solutions  ...  Further, the rate at which this data is being generated induces extensive challenges of data storage, linking, and processing.  ...  It divides MapReduce workload in three categories based on their I/O and CPU load. For any new task, workload type is predicted by the MR-Predict framework. The task is then handled accordingly.  ... 
doi:10.1007/s10723-013-9255-6 fatcat:l27ga4kh7nhnjd6nb6n57autgq

On the inequality of the 3V's of Big Data Architectural Paradigms: A case for heterogeneity [article]

Todor Ivanov, Nikolaos Korfiatis, Roberto V. Zicari
2013 arXiv   pre-print
This paper contributes on the understanding of the Hadoop ecosystem from the perspective of different workloads and aims to help researchers and practitioners on the design of scalable platforms targeting  ...  The well-known 3V architectural paradigm for Big Data introduced by Laney (2011), provides a simplified framework for defining the architecture of a big data platform to be deployed in various scenarios  ...  In the case of Big Data platforms with changing workloads, it is difficult to meet the network and storage I/O guarantees.  ... 
arXiv:1311.0805v2 fatcat:yu7niwfs5fdx7nwe4vd7vtpkve

Graphalytics

Mihai Capotă, Tim Hegeman, Alexandru Iosup, Arnau Prat-Pérez, Orri Erling, Peter Boncz
2015 Proceedings of the GRADES'15 on - GRADES'15  
Although platform diversity is beneficial, it also makes it very challenging to select the best platform for an application domain or one of its important applications, and to design new and tune existing  ...  Continuing a long tradition of using benchmarking to address such challenges, in this work we present our vision for Graphalytics, a big data benchmark for graph-processing platforms.  ...  On the other hand, even though the cluster is slower for smaller graphs, it provides better scalability when data size grows as the computations become I/O bound, thanks to the greater disk bandwidth provided  ... 
doi:10.1145/2764947.2764954 dblp:conf/sigmod/CapotaHIPEB14 fatcat:gcj3bb7cznc3lj3vqe6vfkn2by

DVM

Zhiqiang Ma, Zhonghua Sheng, Lin Gu, Liufei Wen, Gong Zhang
2012 Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments - VEE '12  
The DVM provides a simple yet scalable programming model and mitigates the scalability bottleneck of traditional distributed shared memory systems.  ...  On one physical host, the system overhead of DVM is comparable to that of traditional VMMs. On 16 physical hosts, the DVM runs 10 times faster than MapReduce/Hadoop and X10.  ...  We thank Yanling Zheng, Mengmeng Cheng, Chengqi Song and Yanqun Zhang for their help in various aspects of this project, and the Amazon AWS research grant for the support in the EC2 based evaluation.  ... 
doi:10.1145/2151024.2151032 dblp:conf/vee/MaSGWZ12 fatcat:xykdqu64cvfklco5tt5pycy7b4

DVM

Zhiqiang Ma, Zhonghua Sheng, Lin Gu, Liufei Wen, Gong Zhang
2012 SIGPLAN notices  
The DVM provides a simple yet scalable programming model and mitigates the scalability bottleneck of traditional distributed shared memory systems.  ...  On one physical host, the system overhead of DVM is comparable to that of traditional VMMs. On 16 physical hosts, the DVM runs 10 times faster than MapReduce/Hadoop and X10.  ...  We thank Yanling Zheng, Mengmeng Cheng, Chengqi Song and Yanqun Zhang for their help in various aspects of this project, and the Amazon AWS research grant for the support in the EC2 based evaluation.  ... 
doi:10.1145/2365864.2151032 fatcat:cipjvui4tfdsrmcqui7sxpjhei

A Survey of Big Data Machine Learning Applications Optimization in Cloud Data Centers and Networks [article]

Sanaa Hamid Mohamed, Taisir E.H. El-Gorashi, Jaafar M.H. Elmirghani
2019 arXiv   pre-print
The MapReduce programming model and its widely used open-source platform, Hadoop, are enabling the development of a large number of cloud-based services and big data applications.  ...  as virtualization, and software-defined networking that increasingly support big data systems.  ...  All data are provided in full in the results section of this paper.  ... 
arXiv:1910.00731v1 fatcat:kvi3br4iwzg3bi7fifpgyly7m4

Computing infrastructure for big data processing

Ling Liu
2013 Frontiers of Computer Science  
In this article, we will give an overview of computing infrastructure for big data processing, focusing on architectural, storage and networking challenges of supporting big data analysis.  ...  With the push of big data, we are entering a new era of parallel computing driven by novel and ground breaking research innovation on elastic parallelism and scalability.  ...  ) network traffic for MapReduce workloads.  ... 
doi:10.1007/s11704-013-3900-x fatcat:dbhdg4b6r5a5jlbzcescjrvusy

Pilot-Abstraction: A Valid Abstraction for Data-Intensive Applications on HPC, Hadoop and Cloud Infrastructures? [article]

Andre Luckow, Pradeep Mantha, Shantenu Jha
2015 arXiv   pre-print
the separation of storage and compute are not optimal for I/O intensive workloads (e.g. for data preparation, transformation and SQL).  ...  While there are many powerful computational and analytical libraries available on HPC (e.g. for scalable linear algebra), they generally lack the usability and variety of analytical libraries found in  ...  It has however, some limitations for data-intensive, I/O-bound workloads that require a high sequential read/write performance.  ... 
arXiv:1501.05041v1 fatcat:eiu3inxk7bblrcoh7orjkrimjq