HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline [article]

Richard Liaw, Romil Bhardwaj, Lisa Dunlap, Yitian Zou, Joseph Gonzalez, Ion Stoica, Alexey Tumanov
2020 arXiv pre-print
To optimally trade off evaluating multiple configurations and training the most promising ones by a fixed deadline, we design and build HyperSched -- a dynamic application-level resource scheduler to track ... Prior research in resource scheduling for machine learning training workloads has largely focused on minimizing job completion times. ... ACKNOWLEDGEMENTS We thank our shepherd Srinivasan Parthasarathy and the anonymous reviewers for their valuable feedback and suggestions to improve this work. ...
arXiv:2001.02338v1 fatcat:2dvde4emhnhr3ggxyypsc2m3ce

Resource Allocation in Multi-armed Bandit Exploration: Overcoming Sublinear Scaling with Adaptive Parallelism [article]

Brijen Thananjeyan, Kirthevasan Kandasamy, Ion Stoica, Michael I. Jordan, Ken Goldberg, Joseph E. Gonzalez
2021 arXiv pre-print
Second, we present an algorithm for a fixed deadline setting, where we are given a time deadline and need to maximize the probability of finding the best arm. ... For example, in simulation-based scientific studies, an expensive simulation can be sped up by running it on multiple cores. ... Hypersched: Dynamic resource reallocation for model development on a deadline. In Proceedings of the ACM Symposium on Cloud Computing, pages 61-73, 2019. Herbert Robbins. ...
arXiv:2011.00330v2 fatcat:l35y63hwi5f2nnmgug7kxqlzqy

Holistic Runtime Scheduling for the Distributed Computing Landscape

Marcel Blöcher
Internet services have become an indispensable part of our lives, with billions of users on a daily basis. ... A straightforward strategy to provide services with high availability is to allocate dedicated resources for each service. ... HyperSched focuses on machine learning training workloads and enables the automatic exploration of the optimal tradeoff between hyper-parameter configurations and training deadline guarantees [Lia+19] ...
doi:10.26083/tuprints-00018576 fatcat:yhndjijxcjb6bn2h45c6r4aqia