IA Scholar Query: Explicit M/G/1 waiting-time distributions for a class of long-tail service-time distributions.
https://scholar.archive.org/
Internet Archive Scholar query results feed
Mon, 24 Oct 2022 00:00:00 GMT

A New Toolbox for Scheduling Theory
https://scholar.archive.org/work/drslrwk5cnb3rmeh3wxaw7ffda
Queueing delays are ubiquitous in many domains, including computer systems, service systems, communication networks, supply chains, and transportation. Queueing and scheduling theory provide a rigorous basis for understanding how to reduce delays with scheduling, including evaluating policy performance and guiding policy design. Unfortunately, state-of-the-art theory fails to address many practical concerns. For example, scheduling theory seldom treats nontrivial preemption limitations, and there is very little theory for scheduling in multiserver queues. We present two new, broadly applicable tools that greatly expand the reach of scheduling theory, using each to solve multiple open problems. The first tool, called "SOAP", is a new unifying theory of scheduling in single-server queues, specifically the M/G/1 model. SOAP characterizes the delay distribution of a broad space of policies, most of which have never been analyzed before. Such policies include the Gittins index policy, which minimizes mean delay in low-information settings, and many policies with preemption limitations. The second tool, called "WINE", is a new queueing identity that complements Little's law. WINE enables a new method of analyzing complex queueing systems by relating them to simpler systems. This results in the first delay bounds for SRPT (shortest remaining processing time) and the Gittins index policy in multiserver queues, specifically the M/G/k model.
Ziv Scully
Mon, 24 Oct 2022 00:00:00 GMT

Model-based resource management for fine-grained services
https://scholar.archive.org/work/uumefncwjze4xodsqtiuwr2zhy
The emergence of DevOps has changed the way modern distributed software systems are developed. Architectures decomposed in fine-grained services, such as microservices or function-as-a-service (FaaS), are now widespread across many organizations. From a resource management perspective, although the systems built with such architectures have many benefits, there are still research challenges that need further attention. In this study, we have focused on three such challenges, each concerning a specific system resource: compute, memory, or storage. Firstly, we focus on scaling the capacity of microservices at runtime. Here, the challenge is to design an autoscaler that can decide between vertical and horizontal scaling options to distribute the CPU capacity. Secondly, we focus on estimating the required capacity of an on-premises FaaS platform such that the service level agreements (SLAs) for function response times are satisfied. The challenge here is to address the cold start dilemma, i.e., that a cold start delays a function response but reduces the memory consumption. Thus, we must find a limit of cold starts such that the memory consumption remains in check while satisfying the SLAs. Finally, we focus on the storage management for distributed tracing targeted at microservices. The volume of such traces generated in a data center can be in the scale of tens of terabytes per day, but only a small fraction of these traces is useful for troubleshooting. The objective then is to sample only the useful traces. The key to addressing all these challenges is first modeling the dynamics concerning the resources and subsequently leveraging the model in a resource controller. To address the first challenge, we have developed an autoscaler, ATOM, that leverages layered queueing network (LQN) models to take its scaling decisions. Our experiment, with a real-life application, shows that ATOM produces 30-37% better results than the baseline autoscalers. For the second challenge, we have developed COCOA, a cold start aware cap [...]
Alim Ul Gias, Giuliano Casale, Commonwealth Scholarship Commission In The UK
Fri, 09 Sep 2022 00:00:00 GMT

An online learning approach to dynamic pricing and capacity sizing in service systems
https://scholar.archive.org/work/gvfe3cj765hpvo4hsdgsq4gavm
We study a dynamic pricing and capacity sizing problem in a GI/GI/1 queue, where the service provider's objective is to obtain the optimal service fee p and service capacity μ so as to maximize the cumulative expected profit (the service revenue minus the staffing cost and delay penalty). Due to the complex nature of the queueing dynamics, such a problem has no analytic solution, so previous research often resorts to heavy-traffic analysis where both the arrival rate and service rate are sent to infinity. In this work we propose an online learning framework designed for solving this problem which does not require the system's scale to increase. Our framework is dubbed Gradient-based Online Learning in Queue (GOLiQ). GOLiQ organizes the time horizon into successive operational cycles and prescribes an efficient procedure to obtain improved pricing and staffing policies in each cycle using data collected in previous cycles. Data here include the number of customer arrivals, waiting times, and the server's busy times. The ingenuity of this approach lies in its online nature, which allows the service provider to do better by interacting with the environment. Effectiveness of GOLiQ is substantiated by (i) theoretical results including the algorithm convergence and regret analysis (with a logarithmic regret bound), and (ii) engineering confirmation via simulation experiments of a variety of representative GI/GI/1 queues.
Xinyun Chen, Yunan Liu, Guiyu Hong
Wed, 07 Sep 2022 00:00:00 GMT

A Model of Job Parallelism for Latency Reduction in Large-Scale Systems
https://scholar.archive.org/work/dlsqwswpjvevpao33edbyfvpru
Processing computation-intensive jobs at multiple processing cores in parallel is essential in many real-world applications. In this paper, we consider an idealised model for job parallelism in which a job can be served simultaneously by d distinct servers. The job is considered complete when the total amount of work done on it by the d servers equals its size. We study the effect of parallelism on the average delay of jobs. Specifically, we analyze a system consisting of n parallel processor sharing servers in which jobs arrive according to a Poisson process of rate nλ (λ < 1) and each job brings an exponentially distributed amount of work with unit mean. Upon arrival, a job selects d servers uniformly at random and joins all the chosen servers simultaneously. We show by a mean-field analysis that, for fixed d ≥ 2 and large n, the average occupancy of servers is O(log(1/(1-λ))) as λ → 1, in comparison to O(1/(1-λ)) average occupancy for d = 1. Thus, we obtain an exponential reduction in the response time of jobs through parallelism. We make significant progress towards rigorously justifying the mean-field analysis.
Ayalvadi Ganesh, Arpan Mukhopadhyay
Wed, 20 Jul 2022 00:00:00 GMT

Size-based prioritization in Network Utility Maximization and its applicability to decentralized elastic electricity demand scheduling
https://scholar.archive.org/work/abvhest6czdtnkcjii4tgulooe
Internet communications and electricity consumption both require access to scarce resources while users expect them to occur without delays. This thesis proposes a common scheduling framework that aims at minimizing response time in both scenarios.
BRUNO LUIS MENDIVEZ VASQUEZ
Wed, 29 Jun 2022 00:00:00 GMT

On the stochastic and asymptotic improvement of First-Come First-Served and Nudge scheduling
https://scholar.archive.org/work/6x2eglhxh5cyvklu76a2blvk34
Recently it was shown that, contrary to expectations, the First-Come-First-Served (FCFS) scheduling algorithm can be stochastically improved upon by a scheduling algorithm called Nudge for light-tailed job size distributions. Nudge partitions jobs into 4 types based on their size, say small, medium, large and huge jobs. Nudge operates identically to FCFS, except that whenever a small job arrives that finds a large job waiting at the back of the queue, Nudge swaps the small job with the large one unless the large job was already involved in an earlier swap. In this paper, we show that FCFS can be stochastically improved upon under far weaker conditions. We consider a system with 2 job types and limited swapping between type-1 and type-2 jobs, but where a type-1 job is not necessarily smaller than a type-2 job. More specifically, we introduce and study the Nudge-K scheduling algorithm which allows type-1 jobs to be swapped with up to K type-2 jobs waiting at the back of the queue, while type-2 jobs can be involved in at most one swap. We present an explicit expression for the response time distribution under Nudge-K when both job types follow a phase-type distribution. Regarding the asymptotic tail improvement ratio (ATIR), we derive a simple expression for the ATIR, as well as for the K that maximizes the ATIR. We show that the ATIR is positive and the optimal K tends to infinity in heavy traffic as long as the type-2 jobs are on average longer than the type-1 jobs.
Benny Van Houdt
Tue, 21 Jun 2022 00:00:00 GMT

Stochastic approximation of symmetric Nash equilibria in queueing games
https://scholar.archive.org/work/ak3fn5rpzfbbxeruhaqekcg2jm
We suggest a novel stochastic-approximation algorithm to compute a symmetric Nash-equilibrium strategy in a general queueing game with a finite action space. The algorithm involves a single simulation of the queueing process with dynamic updating of the strategy at regeneration times. Under mild assumptions on the utility function and on the regenerative structure of the queueing process, the algorithm converges to a symmetric equilibrium strategy almost surely. This yields a powerful tool that can be used to approximate equilibrium strategies in a broad range of strategic queueing models in which direct analysis is impracticable.
Liron Ravner, Ran I. Snitkovsky
Mon, 20 Jun 2022 00:00:00 GMT

WCFS: A new framework for analyzing multiserver systems
https://scholar.archive.org/work/56cwqwn2jrdx3g3m6lir6ad6wu
Multiserver queueing systems are found at the core of a wide variety of practical systems. Many important multiserver models have a previously-unexplained similarity: identical mean response time behavior is empirically observed in the heavy traffic limit. We explain this similarity for the first time. We do so by introducing the work-conserving finite-skip (WCFS) framework, which encompasses a broad class of important models. This class includes the heterogeneous M/G/k, the limited processor sharing policy for the M/G/1, the threshold parallelism model, and the multiserver-job model under a novel scheduling algorithm. We prove that for all WCFS models, scaled mean response time E[T](1-ρ) converges to the same value, E[S^2]/(2E[S]), in the heavy-traffic limit, which is also the heavy traffic limit for the M/G/1/FCFS. Moreover, we prove additively tight bounds on mean response time for the WCFS class, which hold for all load ρ. For each of the four models mentioned above, our bounds are the first known bounds on mean response time.
Isaac Grosof, Mor Harchol-Balter, Alan Scheller-Wolf
Sun, 12 Jun 2022 00:00:00 GMT

Designing optimal allocations for cancer screening using queuing network models
https://scholar.archive.org/work/xywbrobjdbhzvdpc6znoh43zrq
Cancer is one of the leading causes of death, but mortality can be reduced by detecting tumors earlier so that treatment is initiated at a less aggressive stage. The tradeoff between costs associated with screening and its benefit makes the decision of whom to screen and when a challenge. To enable comparisons across screening strategies for any cancer type, we demonstrate a mathematical modeling platform based on the theory of queuing networks designed for quantifying the benefits of screening strategies. Our methodology can be used to design optimal screening protocols and to estimate their benefits for specific patient populations. Our method is amenable to exact analysis, thus circumventing the need for simulations, and is capable of exactly quantifying outcomes given variability in the age of diagnosis, rate of progression, and screening sensitivity and intervention outcomes. We demonstrate the power of this methodology by applying it to data from the Surveillance, Epidemiology and End Results (SEER) program. Our approach estimates the benefits that various novel screening programs would confer to different patient populations, thus enabling us to formulate an optimal screening allocation and quantify its potential effects for any cancer type and intervention.
Justin Dean, Evan Goldberg, Franziska Michor, Attila Csikász-Nagy
Fri, 27 May 2022 00:00:00 GMT

Stochastic Review Inventory Systems with Deteriorating Items; A Steady-State Non-Linear Approach
https://scholar.archive.org/work/bbe5cneebre3fjuux7p3fw44ga
The primary goal of business organizations is to maximize their productivity and profit whilst reducing the cost resulting from lost sales and services given to their customers, which can be achieved by exceeding the balance between demand and supply. Analyzing real-world situations, including integrated queuing-inventory systems, such as M/M/1-systems and M/M/1/∞-systems, can help business organizations reach this goal. This research analyzes integrated queuing-inventory systems with lost sales validated under a deterministic and uniformly distributed order size scheme under continuous review. The limited integrated inventory-queuing M/M/1/N-1-system was chosen as the subject of our interest due to its closeness to reality. Thus, this system with exponentially distributed deteriorating products and random planning time with lost sales was simulated. This research aimed to analyze customers' satisfaction by studying the addition of the deterioration parameter γ to the model under consideration. The proposed model's demand was based on a Poisson process, wherein service times and lead times are exponentially distributed. We also examined the M/M/1/∞ and M/M/1/N-1-systems investigated by Schwarz et al., using the proposed method to solve the linear system of equations obtained from the steady-state system balance equations; the results obtained are compared to those obtained from simulating the Schwarz approach. The analyzed model was tested for different values of Q, demand rate λ, and γ. The obtained results showed a strong dependency between γ, Q, and λ, providing the needed information for decision-makers to reach their goals depending on the performance measure of interest.
Adel F. Alrasheedi, Khalid A. Alnowibet, Ibtisam T. Alotaibi
Sat, 16 Apr 2022 00:00:00 GMT

Precoding and Scheduling for AoI Minimization in MIMO Broadcast Channels
https://scholar.archive.org/work/nhavm7a27raehclrihgm6scemu
In this paper, we consider a status updating system where updates are generated at a constant rate at K sources and sent to the corresponding recipients through a noise-free broadcast channel. We assume that perfect channel state information (CSI) is available at the transmitter before each transmission, and the transmitter is able to utilize the CSI to precode the updates. Our objective is to design optimal precoding schemes to minimize the summed average age of information (AoI) at the recipients. Under various assumptions on the size of each update B, the number of transmit antennas M, and the number of receive antennas N at each user, this paper identifies the corresponding age-optimal precoding and transmission scheduling strategies. Specifically, for the case when N=1, a round-robin based updating scheme is shown to be optimal. For the two-user systems with N>B or M∉[N:2N], framed updating schemes are proven to be optimal. For other cases in the two-user systems, a framed alternating updating scheme is proven to be 2-optimal.
Songtao Feng, Jing Yang
Tue, 15 Mar 2022 00:00:00 GMT

Generalized Bayesian Likelihood-Free Inference Using Scoring Rules Estimators
https://scholar.archive.org/work/snucxyqpjfgure4mx6zqg4pt3u
We propose a framework for Bayesian Likelihood-Free Inference (LFI) based on Generalized Bayesian Inference. To define the generalized posterior, we use Scoring Rules (SRs), which evaluate probabilistic models given an observation. In LFI, we can sample from the model but not evaluate the likelihood; for this reason, we employ SRs with easy empirical estimators. Our framework includes novel approaches and popular LFI techniques (such as Bayesian Synthetic Likelihood) and enjoys posterior consistency in a well-specified setting when a strictly-proper SR is used (i.e., one whose expectation is uniquely minimized when the model corresponds to the data generating process). In general, our framework does not approximate the standard posterior; as such, it is possible to achieve outlier robustness, which we prove is the case for the Kernel and Energy Scores. We also discuss a strategy for tuning the learning rate in the generalized posterior suitable for the LFI setup. We run simulation studies with correlated pseudo-marginal Markov Chain Monte Carlo and compare with related approaches on standard benchmarks and challenging intractable-likelihood models from meteorology and ecology.
Lorenzo Pacchiardi, Ritabrata Dutta
Tue, 15 Mar 2022 00:00:00 GMT

High-Priority Expected Waiting Times in the Delayed Accumulating Priority Queue with Applications to Health Care KPIs
https://scholar.archive.org/work/7zt3wwc4wrfuxgldb367t4knwa
We provide the first analytical expressions for the expected waiting time of high-priority customers in the delayed accumulating priority queue (APQ) by exploiting a classical conservation law for work-conserving queues. Additionally, we describe an algorithm to compute the expected waiting times of both low-priority and high-priority customers, which requires only the truncation of sums that converge quickly in our experiments. These insights are used to demonstrate how the accumulation rate and delay level should be chosen by health care practitioners to optimize common key performance indicators (KPIs). In particular, we demonstrate that for certain nontrivial KPIs, an accumulating priority queue with a delay of zero is always preferable. Finally, we present a detailed investigation of the quality of an exponential approximation to the high-priority waiting time distribution, which we use to optimize the choice of queueing parameters with respect to both classes' waiting time distributions.
Blair Bilodeau, David A. Stanford
Thu, 10 Feb 2022 00:00:00 GMT

Can machines solve general queueing systems?
https://scholar.archive.org/work/sp5lmj6lsndopb45tiwfgmyvdi
In this paper, we analyze how well a machine can solve a general problem in queueing theory. To answer this question, we use a deep learning model to predict the stationary queue-length distribution of an M/G/1 queue (Poisson arrivals, general service times, one server). To the best of our knowledge, this is the first time a machine learning model is applied to a general queueing theory problem. We chose the M/G/1 queue for this paper because it lies "on the cusp" of the analytical frontier: on the one hand, an exact solution for this model is available, which is both computationally and mathematically complex. On the other hand, the problem (specifically the service time distribution) is general. This allows us to compare the accuracy and efficiency of the deep learning approach to the analytical solutions. The two key challenges in applying machine learning to this problem are (1) generating a diverse set of training examples that provide a good representation of a "generic" positive-valued distribution, and (2) representing the continuous distribution of service times as an input. We show how we overcome these challenges. Our results show that our model is indeed able to predict the stationary behavior of the M/G/1 queue extremely accurately: the average value of our metric over the entire test set is 0.0009. Moreover, our machine learning model is very efficient, computing very accurate stationary distributions in a fraction of a second (an approach based on simulation modeling would take much longer to converge). We also present a case study that mimics a real-life setting and shows that our approach is more robust and provides more accurate solutions compared to the existing methods. This shows the promise of extending our approach beyond the analytically solvable systems (e.g., G/G/1 or G/G/c).
Eliran Sherzer, Arik Senderovich, Opher Baron, Dmitry Krass
Thu, 03 Feb 2022 00:00:00 GMT

Extremal Queueing Theory
https://scholar.archive.org/work/fa3giysyo5bbpcbgbbrfr6zwoe
Queueing theory has often been applied to study communication and service queueing systems such as call centers, hospital emergency departments and ride-sharing platforms.
Yan Chen

A Foreground-Background queueing model with speed or capacity modulation
https://scholar.archive.org/work/pr4zimmq25c4xcpxb6a6bstuna
The models studied in the steady state involve two queues which are served either by a single server whose speed depends on the number of jobs present, or by several parallel servers whose number may be controlled dynamically. Job service times have a two-phase Coxian distribution and the second phase is given lower priority than the first. The trade-offs between holding costs and energy consumption costs are examined by means of suitable cost functions. Two different two-dimensional Markov processes are solved exactly. The solutions are used in several numerical experiments. Some counter-intuitive results are observed.
Andrea Marin, Isi Mitrani
Wed, 01 Dec 2021 00:00:00 GMT