Simulation of Job Scheduling for Small Scale Clusters

Hassan Rajaei, Mohammad Dadfar, Pankaj Joshi
2006 Proceedings of the 2006 Winter Simulation Conference  
Despite the growing popularity of small-scale clusters built out of off-the-shelf components, there has been little research on how these small-scale clusters behave under different scheduling policies.  ...  The simulation results indicate that a time-sharing scheduler for clusters could exhibit superior performance over a batch policy.  ...  Even though it incurs a heavy context switch overhead, it may still serve as a viable scheduling alternative in small-scale clusters.  ... 
doi:10.1109/wsc.2006.323211 dblp:conf/wsc/RajaeiDJ06 fatcat:ppf4os6si5c7hobzvs67sgryg4

TracSim: Simulating and scheduling trapped power capacity to maximize machine room throughput

Ziming Zhang, Michael Lang, Scott Pakin, Song Fu
2016 Parallel Computing  
In this paper, we present TracSim, a full-system simulator that enables users to measure trapped power capacity and evaluate the performance of different policies for scheduling parallel tasks under a  ...  TracSim simulates the execution environment of a production HPC cluster at Los Alamos National Laboratory (LANL).  ...  A preliminary version of this paper was presented at International Workshop on Energy Efficient Supercomputing (E2SC) in conjunction with IEEE International Conference for High Performance Computing, Networking  ... 
doi:10.1016/j.parco.2015.11.002 fatcat:ibbpyoxkinfdnmu7x2lavaqom4

Omega: flexible, scalable schedulers for large compute clusters

Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes
2013 Proceedings of the 8th ACM European Conference on Computer Systems - EuroSys '13  
Increasing scale and the need for rapid response to changing requirements are hard to meet with current monolithic cluster scheduler architectures.  ...  [Figure panels: Monolithic, Two-level, Shared state.] In the public trace for cluster C, these are priority bands 0-8 [27].  ...  We would like to thank the Mesos team at UC Berkeley for many fruitful and interesting discussions about Mesos, and Joseph Hellerstein for his early work on modeling scheduler interference in Omega.  ... 
doi:10.1145/2465351.2465386 dblp:conf/eurosys/SchwarzkopfKAW13 fatcat:onvfyrf6ybbobgplgtg6eyndcm

A Simulation Platform for Multi-tenant Machine Learning Services on Thousands of GPUs [article]

Ruofan Liang, Bingsheng He, Shengen Yan, Peng Sun
2022 arXiv   pre-print
Specifically, by trace-driven cluster workload simulation, AnalySIM can easily test and analyze various scheduling policies in a number of performance metrics such as GPU resource utilization.  ...  In this demonstration, we present AnalySIM, a cluster simulator that allows efficient design explorations for multi-tenant machine learning services.  ...  The simulator should simulate the real cluster with small errors. 2) Portability. It should be flexible enough for various cluster management policies. 3) Analyzability.  ... 
arXiv:2201.03175v1 fatcat:5u7epp5gbnanfpte5fnb7mzyii

Effective Elastic Scaling of Deep Learning Workloads [article]

Vaibhav Saxena, K. R. Jayaram, Saurav Basu, Yogish Sabharwal, Ashish Verma
2020 arXiv   pre-print
In this paper, we examine the elastic scaling of Deep Learning (DL) jobs over large-scale training platforms and propose a novel resource allocation strategy for DL training jobs, resulting in improved  ...  job run time performance as well as increased cluster utilization.  ...  This is because, with high arrival rate, the baseline is able to schedule only a limited number of jobs as the batch-size is high and thus the jobs cannot be scaled down to a small number of GPUs.  ... 
arXiv:2006.13878v1 fatcat:glqo5f6psvaujmd5ddbec5h2eu

WaxElephant: A Realistic Hadoop Simulator for Parameters Tuning and Scalability Analysis

Zujie Ren, Zhijun Liu, Xianghua Xu, Jian Wan, Weisong Shi, Min Zhou
2012 2012 Seventh ChinaGrid Annual Conference  
MapReduce is becoming the state-of-the-art computation paradigm for processing large-scale datasets on a large cluster with tens or thousands of nodes.  ...  Two challenging issues for a large-scale Hadoop cluster are how to analyze the scalability and identify the optimal parameters configurations.  ...  We are grateful to Jianying, Yunzheng, Wuwei, Tuhai, Zeyuan for their insightful suggestions.  ... 
doi:10.1109/chinagrid.2012.25 dblp:conf/chinagrid/RenLXWSZ12 fatcat:6ed3oni6abcmjhqxyzmmdjoaba

Optimizing cost and performance trade-offs for MapReduce job processing in the cloud

Zhuoyao Zhang, Ludmila Cherkasova, Boon Thau Loo
2014 2014 IEEE Network Operations and Management Symposium (NOMS)  
The results of our simulation study are validated through experiments with Hadoop clusters deployed on different Amazon EC2 instances.  ...  A user can define a set of different SLOs: i) achieving a given completion time for a set of MapReduce jobs while minimizing the cost (budget), or ii) for a given budget select the type and the number  ...  For each platform of choice, i.e., small, medium, and large EC2 instances, and a given Hadoop cluster size, the Job Scheduler component generates the optimized MapReduce job schedule using the Johnson  ... 
doi:10.1109/noms.2014.6838231 dblp:conf/noms/ZhangCL14 fatcat:om773nkvqzg2bgsmv4jttnhzj4

Heterogeneous cores for MapReduce processing: Opportunity or challenge?

Feng Yan, Ludmila Cherkasova, Zhuoyao Zhang, Evgenia Smirni
2014 2014 IEEE Network Operations and Management Symposium (NOMS)  
Our preliminary performance evaluation results confirm potential benefits of heterogeneous multi-core processors for "faster" processing of the small, interactive MapReduce jobs, while at the same time  ...  In this work, we design a new Hadoop scheduler, called DyScale, that exploits capabilities offered by heterogeneous cores for achieving a variety of performance objectives.  ...  For improving the execution time of small MapReduce jobs, one cannot use the scale-out approach, but rather can benefit from the "scale-up" approach, where the tasks comprising such jobs can be executed  ... 
doi:10.1109/noms.2014.6838339 dblp:conf/noms/YanCZS14 fatcat:jliihgk4ozbl3il4slco4r5ihe

Modelling Resilience in Cloud-Scale Data Centres [article]

John Cartlidge, Ilango Sriram
2011 arXiv   pre-print
Here, we present results demonstrating the resilience of different job scheduling algorithms in a simulated DC with hardware failure.  ...  We use a simple model of jobs distributed across a hardware network to demonstrate the relationship between resilience and additional communication costs of different scheduling methods.  ...  ACKNOWLEDGMENTS: Financial support for this work came from the EPSRC grant EP/H042644/1 (for J. Cartlidge) and from Hewlett-Packard's Automated Infrastructure Lab, HP Labs Bristol (for I. Sriram).  ... 
arXiv:1106.5457v1 fatcat:oehr6led6zh6phknjtyfttiurq

DyScale: A MapReduce Job Scheduler for Heterogeneous Multicore Processors

Feng Yan, Ludmila Cherkasova, Zhuoyao Zhang, Evgenia Smirni
2017 IEEE Transactions on Cloud Computing  
Using measurements on an actual experimental setting and via simulation, we argue in favor of heterogeneous multi-core processors as they achieve "faster" (up to 40 percent) processing of small, interactive  ...  We evaluate the performance benefits of DyScale versus the FIFO and Capacity job schedulers that are broadly used in the Hadoop community.  ...  This policy is not efficient for small jobs if large jobs are also present. The Hadoop Fair Scheduler aims to solve this problem.  ... 
doi:10.1109/tcc.2015.2415772 fatcat:ls62o7yt2vbulnpw5zfwzj5lui

A New Fuzzy-based Job Scheduling Algorithm for Cluster Computing

Behzad Azizpour, Mehdi Effatparvar, Mohammad Sadeq Garshasbi
2013 International Journal of Computer Applications  
In this paper, we introduce a method based on fuzzy logic for scheduling parallel jobs on cluster systems. The main objective is to improve performance and power consumption.  ...  The simulation results indicate that the introduced method performs better than the FCFS and SJF algorithms. General Terms: Cluster Systems, Scheduling, Fuzzy Logic.  ...  CONCLUSION: In this study, we proposed a fuzzy-based algorithm for job scheduling in clusters. The proposed method found a better solution for assigning jobs to the cluster system.  ... 
doi:10.5120/13360-0953 fatcat:bcerh7rxmjcvdkbzqu3fjacr2i

Static Parallel Job Scheduling in Computational Grids

Hamed Vahdat-Nejad, Reza Monsefi
2008 2008 International Conference on Computer and Electrical Engineering  
To efficiently schedule submitted jobs, WAN behavior should be considered as an important parameter, which highly influences the communication time of a job.  ...  The scheduler exploits the capabilities of fuzzy logic to qualitatively deal with different parameters available in the scheduling decision.  ...  I would like to thank them for giving us the opportunity for doing this work.  ... 
doi:10.1109/iccee.2008.175 fatcat:j2diatsjnrcvbghqoi7j42aaqu

Evaluating job packing in warehouse-scale computing

Abhishek Verma, Madhukar Korupolu, John Wilkes
2014 2014 IEEE International Conference on Cluster Computing (CLUSTER)  
But which metric should be used when evaluating schedulers for warehouse-scale (cloud) clusters, which have machines of different types and sizes, heterogeneous workloads with dependencies and constraints  ...  One of the key factors in selecting a good scheduling algorithm is using an appropriate metric for comparing schedulers.  ...  RELATED WORK Job scheduling has seen a rich body of work ranging from kernel level scheduling (on a single computer) to scheduling in grids and large warehouse-scale clusters.  ... 
doi:10.1109/cluster.2014.6968735 dblp:conf/cluster/VermaKW14 fatcat:ofiotyhk5ffzdbptlr6zdptgba

Social Networking Reduces Peak Power Consumption in Smart Grid

Qiuyuan Huang, Xin Li, Jing Zhao, Dapeng Wu, Xiang-Yang Li
2015 IEEE Transactions on Smart Grid  
Then, given a set of jobs of users' appliances to be scheduled in the next scheduling period, we use a distributed scheduling algorithm to minimize the peak power consumption of each group of users.  ...  Index Terms: Power grid, social network, family plan, distributed clustering, trace-driven simulator.  ...  DESIGN OF TRACE-DRIVEN SIMULATOR: In this section, we present our design of a trace-driven simulator for large-scale simulations in smart grid.  ... 
doi:10.1109/tsg.2014.2379618 fatcat:cv7wxazpb5hj5h5bicxn237yh4

Exploiting Cloud Heterogeneity to Optimize Performance and Cost of MapReduce Processing

Zhuoyao Zhang, Ludmila Cherkasova, Boon Thau Loo
2015 Performance Evaluation Review  
and the job schedule) for processing these jobs within a given deadline while minimizing the rented infrastructure cost.  ...  We aim to solve the following problem: given a completion time target for a set of MapReduce jobs, determine a homogeneous or heterogeneous Hadoop cluster configuration (i.e., the number, types of VMs,  ...  A few MapReduce simulators were introduced for the analysis and exploration of Hadoop cluster configuration and optimized job scheduling decisions.  ... 
doi:10.1145/2788402.2788409 fatcat:5di4bmljmzem5m4g5l7xpbhbla