Performance Improvement of DAG-Aware Task Scheduling Algorithms with Efficient Cache Management in Spark

Yao Zhao, Jian Dong, Hongwei Liu, Jin Wu, Yanxin Liu
2021 Electronics  
However, current DAG-aware task scheduling algorithms, among which HEFT and GRAPHENE are notable, pay little attention to the cache management policy, which plays a vital role in in-memory data-parallel  ...  Cache management policies that are designed for Spark exhibit poor performance in DAG-aware task-scheduling algorithms, which leads to cache misses and performance degradation.  ...  Introduction Spark is an in-memory data analytics framework that is used extensively in iterative data processing with low latency [1] [2] [3] [4] .  ... 
doi:10.3390/electronics10161874 fatcat:6flphen445b4pb4uvfpoqqqjoi

Pilot-Abstraction: A Valid Abstraction for Data-Intensive Applications on HPC, Hadoop and Cloud Infrastructures? [article]

Andre Luckow, Pradeep Mantha, Shantenu Jha
2015 arXiv   pre-print
As memory naturally fits in with the Pilot concept of retaining resources for a set of tasks, we propose the extension of the Pilot-Abstraction to in-memory resources.  ...  Further, in-memory capabilities have been deployed to enhance the performance of large-scale data analytics (e.g. iterative algorithms) for which the ability to re-use data across iterations is critical  ...  The Pilot-Agent will stage-in and out data via the data manager. The Distributed Memory Manager handles the caching of data required for the computation. CUs are executed via the Compute Manager.  ... 
arXiv:1501.05041v1 fatcat:eiu3inxk7bblrcoh7orjkrimjq

Author index

2006 2006 IEEE International Conference on Cluster Computing  
Caches for Transient Data Maccabe, Arthur B.  ...  Tong, Yizhu Application-aware Interface for SOAP Communication in Web Services Uhlemann, Kai JOSHUA: Symmetric Active/Active Replication for Highly Available HPC Job and Resource Management Underwood  ... 
doi:10.1109/clustr.2006.311921 fatcat:vmbbimypuze7ncjqfonu4po5l4

Secure and Optimized Cloud-Based Cyber-Physical Systems with Memory-Aware Scheduling Scheme

Dr. Wang Haoxiang, Dr. S. Smys
2020 Journal of Trends in Computer Science and Smart Technology  
For this purpose, in cloud environment, virtual machines (VMs) are used for hosting the applications and the resources are managed thereby optimizing the energy consumption.  ...  However, despite the wide application and deployment of CPS in combining the key technologies like big data analytics, cloud computing and IoT, its energy consumption is large.  ...  The global and local phases are used by cache-aware scheduler for scheduling the behavior of cache in all nodes or specific nodes.  ... 
doi:10.36548/jtcsst.2020.3.003 fatcat:adui6oubajbq5hfgxpsutpeei4

A Community Cache with Complete Information

Mania Abdi, Amin Mosayyebzadeh, Mohammad Hossein Hajkazemi, Emine Ugur Kaynar, Ata Turk, Larry Rudolph, Orran Krieger, Peter Desnoyers
2021 USENIX Conference on File and Storage Technologies  
It integrates rich information from analytics platforms with global knowledge about demand and resource availability to enable sophisticated cache management and prefetching strategies that, for example  ...  Kariz is a new architecture for caching data from datalakes accessed, potentially concurrently, by multiple analytic platforms.  ...  Partial support for this  ... 
dblp:conf/fast/AbdiMHKTRKD21 fatcat:zv3megbxrrfd3ayxkbelbjxpfm

Shared Memory Based RDD Data Sharing on Spark

Hai-Hua WANG, Yi LIANG, Ying HOU, Mang-Mang YANG, Ming-Lu FAN
2016 DEStech Transactions on Engineering and Technology Research  
In this paper, we propose an extension of Apache Spark, called Shared Memory Spark (SMSpark). SMSpark introduces shared memory based on RDD data sharing between applications.  ...  Apache Spark is an increasingly popular fast big data analytics engine, which focuses on a large-scale data processing.  ...  So it is important to implement an efficient way of memory resource management in Spark.  ... 
doi:10.12783/dtetr/ssme-ist2016/3939 fatcat:m4swsyixrvg6bnhbjyeg3cxrf4

Hetero-DB: Next Generation High-Performance Database Systems by Best Utilizing Heterogeneous Computing and Storage Resources

Kai Zhang, Feng Chen, Xiaoning Ding, Yin Huai, Rubao Lee, Tian Luo, Kaibo Wang, Yuan Yuan, Xiaodong Zhang
2015 Journal of Computer Science and Technology  
Hetero-DB develops a GPU-aware query execution engine with GPU device memory management and query scheduling mechanism to support concurrent query execution.  ...  GPU may offer an order of higher throughput for applications with massive data parallelism, compared with the multicore CPU.  ...  for database resource management.  ... 
doi:10.1007/s11390-015-1553-y fatcat:tv5xc6p4avhebfkhmzj5emcl2y

Scheduling Data-Intensive Tasks on Heterogeneous Many Cores

Pinar Tözün, Helena Kotthaus
2019 IEEE Data Engineering Bulletin  
The increasing levels of parallelism and heterogeneity in emerging server hardware amplify these challenges in addition to the increasing variety of data-intensive applications.  ...  This requires being aware of the micro-architectural features of processors, the hardware topology connecting the processing units of a server, and the characteristics of these units as well as the data-intensive  ...  The authors would like to thank Jens Teubner, Philippe Bonnet, and Danica Porobic for providing valuable feedback.  ... 
dblp:journals/debu/TozunK19 fatcat:ej47kfvucfhwpgyhmpvnxc3lqq

2020 Index IEEE Transactions on Parallel and Distributed Systems Vol. 31

2021 IEEE Transactions on Parallel and Distributed Systems  
., +, TPDS May 2019 1091-1104 Hierarchical Hybrid Memory Management in OS for Tiered Memory Systems.  ...  ., +, TPDS May 2019 1052-1064 Efficient Data Placement and Replication for QoS-Aware Approximate Query Evaluation of Big Data Analytics.  ... 
doi:10.1109/tpds.2020.3033655 fatcat:cpeatdjlpzhqdersvsk5nmzjkm

Ubiquitous knowledge-based framework for RFID semantic discovery in smart u-Commerce environments

Michele Ruta, Floriano Scioscia, Tommaso Di Noia, Eugenio Di Sciascio, Giacomo Piscitelli
2009 Proceedings of the 11th International Conference on Electronic Commerce - ICEC '09  
This paper presents an extended framework to enable u-KBs in mobile scenarios where a semantic discovery is carried out using metadata stored in RFIDs without fixed repositories.  ...  Such a vision allows to build a truly pervasive environment where autonomous objects compose a self-organized evolving discovery architecture suitable for u-Commerce purposes.  ...  Initially, each reader will advertise, for each resource, the managed reference OUUID as well as context-aware parameters (i.e. the Time-To-Live (TTL) of the resource).  ... 
doi:10.1145/1593254.1593257 dblp:conf/ACMicec/RutaSNSP09 fatcat:crggcc2yd5bfhandsy44cztpj4

FlexIO: I/O Middleware for Location-Flexible Scientific Data Analytics

Fang Zheng, Hongbo Zou, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Jai Dayal, Tuan-Anh Nguyen, Jianting Cao, Hasan Abbasi, Scott Klasky, Norbert Podhorszki, Hongfeng Yu
2013 2013 IEEE 27th International Symposium on Parallel and Distributed Processing  
Since different placements have different impact on performance and cost, there is a consequent need for flexibility in the location of data analytics.  ...  The FlexIO middleware described in this paper makes it easy for scientists to obtain such flexibility, by offering simple abstractions and diverse data movement methods to couple simulation with analytics  ...  We also thank Ray Grout from National Renewable Energy Laboratory for his help on S3D application. This work was funded by Scientific Data Management Center, U.S.  ... 
doi:10.1109/ipdps.2013.46 dblp:conf/ipps/ZhengZESWDNCAKPY13 fatcat:uogj5f6yvfhbtcmvccrqjybhoe

Survey of Memory Management Techniques for HPC and Cloud Computing

Anna Pupykina, Giovanni Agosta
2019 IEEE Access  
INDEX TERMS Clouds, high performance computing, memory management, resource management.  ...  In this survey, challenges of memory management in HPC and Cloud Computing, different memory management systems and optimisation techniques to increase memory utilisation are discussed in detail.  ...  The reliability-aware data placement approach analyses memory pages for their hotness (e.g., how 'active' the page is currently in the cache) and vulnerability at runtime.  ... 
doi:10.1109/access.2019.2954169 fatcat:hwtpltrdrffqdjdofhr3shjkla

Answering Provenance-Aware Queries on RDF Data Cubes Under Memory Budgets [chapter]

Luis Galárraga, Kim Ahlstrøm, Katja Hose, Torben Bach Pedersen
2018 Lecture Notes in Computer Science  
The steadily-growing popularity of semantic data on the Web and the support for aggregation queries in SPARQL 1.1 have propelled the interest in Online Analytical Processing (OLAP) and data cubes in RDF  ...  We propose provenance-aware caching (PAC), a caching approach based on a provenance-aware partitioning of RDF graphs, and a benefit model for RDF cubes and SPARQL queries with aggregation.  ...  Acknowledgments This research was partially funded by the Danish Council for Independent Research (DFF) under grant agreement no. DFF-4093-00301.  ... 
doi:10.1007/978-3-030-00671-6_32 fatcat:rztguusjava4doukd736u2zhsi

2020-2021 Index IEEE Transactions on Computers Vol. 70

2021 IEEE transactions on computers  
The Author Index contains the primary entry for each item, listed under the first author's name.  ...  -that appeared in this periodical during 2021, and items from previous years that were commented upon or corrected in 2021.  ...  ., +, TC Aug. 2021 1199-1212 HAM: Hotspot-Aware Manager for Improving Communications With 3D-Stacked Memory.  ... 
doi:10.1109/tc.2021.3134810 fatcat:p5otlsapynbwvjmqogj47kv5qa

Survey of MapReduce on Big Data

Mr.A.Antony Prakash, Dr. A. Aloysius
2017 International Journal Of Engineering And Computer Science  
There is an observation about MapReduce framework. This framework generates large amount of intermediate data.  ...  Hadoop and MapReduce can be used for analyzing enormous amounts of data. Hadoop is an open source software project used to process large data sets.  ...  LITERATURE REVIEW Before you Data Aware Caching for Big-Data Applications Using the MapReduce The author proposes a data aware cache framework for big data application called Dache.  ... 
doi:10.18535/ijecs/v6i3.49 fatcat:tiqxc6djjjbrtndc7qmaep3z7e