Intelligent fitting global real‐time task scheduling strategy for high‐performance multi‐core systems

Junpeng Wu, Enyuan Zhao, Sizhao Li, Yanqiang Wang
2021 CAAI Transactions on Intelligence Technology  
The results show that the intelligently fitting global scheduling strategy for multi-core systems has better performance in the nuclear utilisation rate and task schedulability.  ...  With the development of high-performance computing, it is possible to solve large-scale computing problems.  ...  When the system scale is large, the increase in the number of tasks will not cause the scheduling overhead to increase sharply.  ... 
doi:10.1049/cit2.12063 fatcat:cr7nzgugsngttmwtig7ggvktga

Collaborative Localization and Tracking with Minimal Infrastructure

Yanjun Cao, David St-Onge, Giovanni Beltrame
2020 2020 18th IEEE International New Circuits and Systems Conference (NEWCAS)  
We present a strategy that autonomously shares the UWB network among devices and allows fast and accurate localization and tracking.  ...  This paper aims at covering a large number of devices distributed in many of small rooms, with minimal localization infrastructure.  ...  This logic prevents conflicts to emerge in the schedule, even with hidden nodes.  ... 
doi:10.1109/newcas49341.2020.9159784 dblp:conf/newcas/CaoSB20 fatcat:ajq2kwvvjnah3h6rcfjtt6nlf4

Twine: A Unified Cluster Management System for Shared Infrastructure

Chunqiang Tang, Kenny Yu, Kaushik Veeraraghavan, Jonathan Kaldor, Scott Michelson, Thawan Kooburat, Aravind Anbudurai, Matthew Clark, Kabir Gogia, Long Cheng, Ben Christensen, Alex Gartrell (+8 others)
2020 USENIX Symposium on Operating Systems Design and Implementation  
For instance, rather than deploying an isolated control plane per cluster, Twine scales a single control plane to manage one million machines across all data centers in a geographic region and transparently  ...  Twine has helped convert our infrastructure from a collection of siloed pools of customized machines dedicated to individual workloads, into a large-scale shared infrastructure with fungible hardware.  ...  We thank Niket Agarwal, Marius Eriksen, Tianyin Xu, Murray Stokely, Seth Hettich, and the OSDI reviewers for their insightful feedback.  ... 
dblp:conf/osdi/TangYVKMKACGCCG20 fatcat:bcgo24bwtvfvhklmau6msuzdaa

AntMan: Dynamic Scaling on GPU Clusters for Deep Learning

Wencong Xiao, Shiru Ren, Yong Li, Yang Zhang, Pengyang Hou, Zhi Li, Yihui Feng, Wei Lin, Yangqing Jia
2020 USENIX Symposium on Operating Systems Design and Implementation  
Efficiently scheduling deep learning jobs on large-scale GPU clusters is crucial for job performance, system throughput, and hardware utilization.  ...  This paper presents AntMan, a deep learning infrastructure that co-designs cluster schedulers with deep learning frameworks and has been deployed in production at Alibaba to manage tens of thousands of  ...  We would also like to thank Chen Xing, Jin Ouyang, Xinyuan Li, Lixue Xia for their help in improving quality of writing.  ... 
dblp:conf/osdi/XiaoRLZHLFLJ20 fatcat:lfa2xrj7zveulikmjxzi66e7fm

Optimizing big data processing performance in the public cloud: opportunities and approaches

Dan Wang, Jiangchuan Liu
2015 IEEE Network  
For grass root users or non-computing professionals, the cost for deploying and maintaining a large-scale dedicated server clusters can be prohibitively high, not to mention the technical skills involved  ...  We identify the key differences between running big data processing in a public cloud and in dedicated server clusters.  ...  They rely on a large scale of machines that work together, a.k.a, scale-out, to process the big data in a divide-and-conquer manner.  ... 
doi:10.1109/mnet.2015.7293302 fatcat:kzpwmqzbkbhv3dafry4ye2iaja

Adaptive scheduling under memory pressure on multiprogrammed SMPs

D. S. Nikolopoulos, C. D. Polychronopoulos
2002 Proceedings 16th International Parallel and Distributed Processing Symposium  
We present a simple scheduling strategy that copes with the adverse effects of paging on multiprogrammed SMPs.  ...  with the operating system through a lightweight interface; it is preventive, because it takes scheduling actions before paging occurs; and it is non-intrusive, because the local scheduling actions taken  ...  We present a simple scheduling strategy that attempts to prevent paging.  ... 
doi:10.1109/ipdps.2002.1015481 dblp:conf/ipps/NikolopoulosP02 fatcat:cnghzwptbjdybdllmqkwczeqpi

A Taxonomy of Schedulers – Operating Systems, Clusters and Big Data Frameworks

Leszek Sliwko
2019 Global Journal of Computer Science and Technology  
This review analyzes deployed and actively used workload schedulers' solutions and presents a taxonomy in which those systems are divided into several hierarchical groups based on their architecture and  ...  While other taxonomies do exist, this review has focused on the key design factors that affect the throughput and scalability of a given solution, as well as the incremental improvements which bettered  ...  Fernando Corbató was awarded the Turing Award by the ACM in 1990 'for his pioneering work organizing the concepts and leading the development of the general-purpose, large-scale, time-sharing and resource-sharing  ... 
doi:10.34257/gjcstbvol19is1pg25 fatcat:knhmbtwpzbdadjmjwoqoyhemoe

Productive Efficiency of Energy-Aware Data Centers

Damián Fernández-Cerero, Alejandro Fernández-Montes, Francisco Velasco
2018 Energies  
We identify the best energy policies and scheduling strategies for high and low data-center demands and for medium-sized and large data-centers; moreover, this work enables data-center managers to detect  ...  This analysis evaluates energy consumption and performance indicators for natural DEA and constant returns to scale (CRS).  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/en11082053 fatcat:avydppg4t5e6poprzeongivmfm

Reactive provisioning of backend databases in shared dynamic content server clusters

Gokul Soundararajan, Cristiana Amza
2006 ACM Transactions on Autonomous and Adaptive Systems  
This paper introduces a self-configuring architecture for on-demand resource allocation to applications in a shared database cluster.  ...  We design an efficient method for data migration when joining a new replica to a running application that allows for the quick addition of replicas with minimal disruption of transaction processing.  ...  Conflict-Aware Replication The key idea in conflict-aware replication is to augment the scheduler that distributes queries on the database cluster with reliable state in such a way as to optimize performance  ... 
doi:10.1145/1186778.1186780 fatcat:mbrw3sc4cna4jh6ppq4knrenvm

Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing

Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, Lidong Zhou
2014 USENIX Symposium on Operating Systems Design and Implementation  
Efficiently scheduling data-parallel computation jobs over cloud-scale computing clusters is critical for job performance, system throughput, and resource utilization.  ...  The framework performs scheduling decisions in a distributed manner, utilizing global cluster information via a loosely coordinated mechanism.  ...  Acknowledgements We are grateful to our shepherd Andrew Warfield for his guidance in the revision process and to the anonymous reviewers for their insightful comments.  ... 
dblp:conf/osdi/BoutinELSZQWZ14 fatcat:rqcpw5zh3jh7tjwk3gxwsthazm

Computing at Massive Scale: Scalability and Dependability Challenges

Renyu Yang, Jie Xu
2016 2016 IEEE Symposium on Service-Oriented System Engineering (SOSE)  
Large-scale Cloud systems and big data analytics frameworks are now widely used for practical services and applications.  ...  We first introduce a data-driven analysis methodology for characterizing the resource and workload patterns and tracing performance bottlenecks in a massive-scale distributed computing environment.  ...  We would also like to extend our sincere thanks to the entire SIGRS group from Beihang University, DSS group from University of Leeds, and the Fuxi Distributed resource scheduling team in Alibaba Cloud  ... 
doi:10.1109/sose.2016.73 dblp:conf/sose/YangX16 fatcat:bsbdpnfzpnf5jbl2d3hobd7adu

Limiting global warming by improving data-centre software

Damian Fernandez-Cerero, Alejandro Fernandez-Montes, Agnieszka Jakobik
2020 IEEE Access  
Carbon emissions, greenhouse gases and pollution in general are usually related to traditional factories, so the most modern computing factories have gone unnoticed for the general-public opinion.  ...  To this end, this work is focused on the proposal and analysis of a set of energy-efficiency policies which are applied to traditional and hyper-scale data centres, as well as numerous operation environments  ...  Therefore, each scheduler lacks the global cluster state and tasks requirements which may lead to sub-optimal scheduling decisions. 3) SHARED-STATE CENTRALISED RESOURCE MANAGERS In contrast with the  ... 
doi:10.1109/access.2020.2978306 fatcat:clvrc6pnkzfbzj5uyng4kylh5q

Emergent Failures: Rethinking Cloud Reliability at Scale

Peter Garraghan, Renyu Yang, Zhenyu Wen, Alexander Romanovsky, Jie Xu, Rajkumar Buyya, Rajiv Ranjan
2018 IEEE Cloud Computing  
However, as these systems have continued to grow in scale, heterogeneity and complexity resulting in the manifestation of emergent behaviour, so too have their respective failures.  ...  This work identifies the challenges of emergent failures within Cloud datacenters at scale, their impact upon system resource management, and discusses potential directions of further study for IoT integration  ...  in the slowdown of request handling, and late-timing state mismatch in the state manager leading to the scheduling conflicts.  ... 
doi:10.1109/mcc.2018.053711662 fatcat:b6h2ayfywjdergwfngesyg4mjy

Efficient System-Enforced Deterministic Parallelism [article]

Amittai Aviram, Shu-Chun Weng, Sen Hu, Bryan Ford
2010 arXiv   pre-print
The system runs parallel applications deterministically both on multicore PCs and across nodes in a cluster.  ...  Coarse-grained parallel benchmarks perform and scale comparably to - sometimes better than - conventional systems, though determinism is costly for fine-grained parallel applications.  ...  The master space, required to enforce a total order on synchronization operations, may be a scaling bottleneck unless execution quanta are large.  ... 
arXiv:1005.3450v1 fatcat:2zzzgu5q3vcltklzvmlc6b4ccq

Performance-Aware Scheduling of Parallel Applications on Non-Dedicated Clusters

Alberto Cascajo, David E. Singh, Jesus Carretero
2019 Electronics  
This work presents a HPC framework that provides new strategies for resource management and job scheduling, based on executing different applications in shared compute nodes, maximizing platform utilization  ...  We also introduce an extension of CLARISSE, a middleware for data-staging coordination and control on large-scale HPC platforms that uses the information provided by the monitor in combination with application-level  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/electronics8090982 fatcat:yh2f3uyqtzanlec53kcny4zn6q