Balancing Exploration for Online Receding Horizon Learning Control with Provable Regret Guarantees [article]

Deepan Muthirayan, Jianjun Yuan, Pramod P. Khargonekar
2021 arXiv   pre-print
We address the problem of simultaneously learning and control in an online receding horizon control setting.  ...  We propose a novel approach to explore in an online receding horizon setting. The key challenge is to ensure that the control generated by the receding horizon controller is persistently exciting.  ...  Balancing Exploration for Online Receding Horizon Learning Control with Provable Regret Guarantees arXiv:2010.07269v13  ... 
arXiv:2010.07269v14 fatcat:4xi7n4y5dnbnjmqafnarx6o3xy

Online Learning for Predictive Control with Provable Regret Guarantees [article]

Deepan Muthirayan, Jianjun Yuan, Dileep Kalathil, Pramod P. Khargonekar
2022 arXiv   pre-print
Specifically, we study the online learning problem where the control algorithm does not know the true system model and has only access to a fixed-length (that does not grow with the control horizon) preview  ...  We show that under the standard stability assumption for the model estimate, the CE-MPC algorithm achieves 𝒪(T^{2/3}) dynamic regret.  ...  This online control literature focuses on the finite-time performance guarantees of the algorithms [3]-[6].  ... 
arXiv:2111.15041v2 fatcat:he43ygi3bjegzj2we7si2btdg4

Meta-Learning Guarantees for Online Receding Horizon Learning Control [article]

Deepan Muthirayan, Pramod P. Khargonekar
2022 arXiv   pre-print
In this paper we provide provable regret guarantees for an online meta-learning receding horizon control algorithm in an iterative control setting.  ...  By analysing conditions under which sub-linear regret is achievable, we prove that the meta-learning online receding horizon controller achieves an average of the dynamic regret for the controller cost  ...  Our Contribution In this work we propose a model-based meta-learning receding horizon control algorithm for a RHC setting and provide guarantees for its online performance.  ... 
arXiv:2010.11327v14 fatcat:asbudhtn2bex7bqkbetkb4z3ki

Online Learning Robust Control of Nonlinear Dynamical Systems [article]

Deepan Muthirayan, Pramod P. Khargonekar
2021 arXiv   pre-print
We propose an online controller and present guarantees for the metric R^p_t when the maximum possible attenuation is given by γ, which is a system constant.  ...  We also characterize the lower bound on the required prediction horizon for these guarantees to hold in terms of the system constants.  ...  We present: (i) the performance guarantee for the Receding Horizon Controller (RHC) when the controller has preview of the disturbance w_k for the horizon M, i.e., for t ≤ k ≤ t + M − 1, and (ii) present  ... 
arXiv:2106.04092v1 fatcat:eswglw6apvbvvdf3zswoden7bq

Smoothed Online Combinatorial Optimization Using Imperfect Predictions [article]

Kai Wang, Zhao Song, Georgios Theocharous, Sridhar Mahadevan
2022 arXiv   pre-print
We show that using predictions to plan for a finite time horizon leads to regret dependent on the total predictive uncertainty and an additional switching cost.  ...  This observation suggests choosing a suitable planning window to balance between uncertainty and switching cost, which leads to an online algorithm with guarantees on the upper and lower bounds of the  ...  The planning window in receding horizon control is also restricted to be fixed across different time steps.  ... 
arXiv:2204.10979v1 fatcat:gknwdwylozebtfrrmv65kz6csq

Online Optimal Control with Linear Dynamics and Predictions: Algorithms and Regret Analysis [article]

Yingying Li, Xin Chen, Na Li
2019 arXiv   pre-print
We design online algorithms, Receding Horizon Gradient-based Control (RHGC), that utilize the predictions through finite steps of gradient computations.  ...  In addition, we provide a fundamental limit of the dynamic regret for any online algorithms by considering linear quadratic tracking problems.  ...  In this paper, we propose novel gradient-based online control algorithms, receding horizon gradient-based control (RHGC), and provide nonasymptotic optimality guarantees by dynamic regrets.  ... 
arXiv:1906.11378v3 fatcat:3cjsuax45zcxzarr7n7neykfaa

A Survey of Optimistic Planning in Markov Decision Processes [chapter]

Lucian Buşoniu, Rémi Munos, Robert Babuška
2013 Reinforcement Learning and Approximate Dynamic Programming for Feedback Control  
We review a class of online planning algorithms for deterministic and stochastic optimal control problems, modeled as Markov decision processes.  ...  An overall receding-horizon algorithm results, which can also be seen as a type of model-predictive control.  ...  To illustrate the online control performance of optimistic planning, OPSS is applied in a receding-horizon fashion starting from the stable equilibrium of the pendulum (pointing down), and n = 600.  ... 
doi:10.1002/9781118453988.ch22 fatcat:ix4pbp4qpvc2lp44ckd63o3xa4

Using Predictions in Online Optimization

Niangjun Chen, Joshua Comden, Zhenhua Liu, Anshul Gandhi, Adam Wierman
2016 Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Science - SIGMETRICS '16  
To this point, two promising algorithms have been proposed: Receding Horizon Control (RHC) and Averaging Fixed Horizon Control (AFHC). The comparison of these policies is largely open.  ...  In this paper, we introduce a new class of policies, Committed Horizon Control (CHC), that generalizes both RHC and AFHC.  ...  Algorithm 1 (Receding Horizon Control). For all t ≤ 0, set x_{RHC,t} = 0.  ... 
doi:10.1145/2896377.2901464 dblp:conf/sigmetrics/ChenCLGW16 fatcat:x6gsq7ny4zdvlgivzo2bd26gyy

Using Predictions in Online Optimization

Niangjun Chen, Joshua Comden, Zhenhua Liu, Anshul Gandhi, Adam Wierman
2016 Performance Evaluation Review  
To this point, two promising algorithms have been proposed: Receding Horizon Control (RHC) and Averaging Fixed Horizon Control (AFHC). The comparison of these policies is largely open.  ...  In this paper, we introduce a new class of policies, Committed Horizon Control (CHC), that generalizes both RHC and AFHC.  ...  Algorithm 1 (Receding Horizon Control). For all t ≤ 0, set x_{RHC,t} = 0.  ... 
doi:10.1145/2964791.2901464 fatcat:kowduhpy75esjibo5rwm6gdhia

Adaptive aggregated predictions for renewable energy systems

Balazs Csanad Csaji, Andras Kovacs, Jozsef Vancza
2014 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)  
The forecasts are made for a prototype public lighting microgrid, which includes photovoltaic panels and LED luminaires that regulate their lighting levels, as inputs for a receding horizon controller.  ...  The predictions can be further improved by combining the forecasts of several models using online learning, the framework of prediction with expert advice.  ...  Particularly, we apply a receding horizon controller, namely, in each step we compute an open-loop control sequence for a given horizon T and the environmental feedback is incorporated by recalculating  ... 
doi:10.1109/adprl.2014.7010625 dblp:conf/adprl/CsajiKV14 fatcat:uwldn4hdgffsndebmr3l72ib2m
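
The receding-horizon loop described in the snippet above (plan an open-loop sequence over a horizon, apply only its first element, then replan from the measured state) can be sketched as follows. This is a minimal illustration and not the authors' controller: the scalar linear model, the quadratic cost, and every name used here (`plan_open_loop`, `receding_horizon`, `a`, `b`, `q`, `r`) are assumptions chosen for the example.

```python
def plan_open_loop(x0, horizon, a=1.2, b=1.0, q=1.0, r=0.1):
    """Open-loop control sequence for the assumed model x' = a*x + b*u.

    Minimizes sum(q*x_k**2 + r*u_k**2) over the horizon plus a terminal
    cost q*x_H**2, using the scalar finite-horizon Riccati recursion.
    """
    p = q                      # terminal cost-to-go weight
    gains = []
    for _ in range(horizon):   # backward pass: last stage first
        k = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
        gains.append(k)
    gains.reverse()            # gains[0] now belongs to the first stage

    x, controls = x0, []
    for k in gains:            # forward rollout on the model (no feedback)
        u = -k * x
        controls.append(u)
        x = a * x + b * u
    return controls


def receding_horizon(x0, steps, horizon, disturbance, a=1.2, b=1.0):
    """Apply the first planned control, observe the new state, replan."""
    x, trajectory = x0, [x0]
    for t in range(steps):
        u = plan_open_loop(x, horizon, a, b)[0]   # keep only the first move
        x = a * x + b * u + disturbance(t)        # feedback enters via new x
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # Hypothetical run: unstable plant (a > 1) with a constant disturbance.
    print(receding_horizon(1.0, steps=10, horizon=5, disturbance=lambda t: 0.1))
```

Replanning at every step is what lets the controller absorb the disturbance, which the open-loop plan on its own never sees.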

Improving Tractability of Real-Time Control Schemes via Simplified 𝒮-Lemma [article]

Goran Banjac, Jianzhe Zhen, Dick den Hertog, John Lygeros
2020 arXiv   pre-print
However, the computational effort required to solve the resulting semidefinite program may be prohibitively large for real-time applications requiring a repeated solution of such a problem.  ...  Various control schemes rely on a solution of a convex optimization problem involving a particular robust quadratic constraint, which can be reformulated as a linear matrix inequality using the well-known  ...  Acknowledgements We are grateful to Ahmed Aboudonia for helpful discussions on reconfigurable terminal constraints in MPC.  ... 
arXiv:2012.04688v1 fatcat:cwfywm2v25hsjjjhwcb4m24aoi

Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent [article]

Niangjun Chen, Gautam Goel, Adam Wierman
2018 arXiv   pre-print
We demonstrate the generality of the OBD framework by showing how, with different choices of "balance," OBD can improve upon state-of-the-art performance guarantees for both competitive ratio and regret  ...  We study Smoothed Online Convex Optimization, a version of online convex optimization where the learner incurs a penalty for changing her actions between rounds.  ...  Currently, the only unifying frameworks for SOCO rely on the use of predictions, and use approaches based on receding horizon control, e.g., Chen et al. (2015); Badiei et al. (2015); Chen et al. (  ... 
arXiv:1803.10366v2 fatcat:plqeahnygfhcbfhi3yeq6j2yom

Online Optimization with Predictions and Switching Costs: Fast Algorithms and the Fundamental Limit [article]

Yingying Li, Guannan Qu, Na Li
2020 arXiv   pre-print
We propose two gradient-based online algorithms: Receding Horizon Gradient Descent (RHGD), and Receding Horizon Accelerated Gradient (RHAG).  ...  Moreover, we study the fundamental lower bound on the dynamic regret for a broad class of deterministic online algorithms.  ...  Receding Horizon Gradient Descent (RHGD) Inspired by the offline gradient descent, we design our online RHGD (see Algorithm 1).  ... 
arXiv:1801.07780v4 fatcat:o5bs7pk2l5aa5mk3azpbwxnfz4

Performance and safety of Bayesian model predictive control: Scalable model-based RL with guarantees [article]

Kim P. Wabersich, Melanie N. Zeilinger
2020 arXiv   pre-print
data-driven control of constrained dynamical systems.  ...  The reason for this unexplored potential is partly related to the significant required tuning effort, large numbers of required learning episodes, i.e. experiments, and the limited availability of RL methods  ...  For long task horizons T , another common approximation in MPC is to select a smaller prediction horizon and to operate in a receding horizon fashion, see e.g. [4, Section 2.2].  ... 
arXiv:2006.03483v2 fatcat:ydu4dkujrvg2vod6353cty47ky

Regret-optimal control in dynamic environments [article]

Gautam Goel, Babak Hassibi
2021 arXiv   pre-print
Unlike most prior work in this area, we focus on the problem of designing an online controller which minimizes regret against the best dynamic sequence of control actions selected in hindsight (dynamic  ...  regret), instead of the best fixed controller in some specific class of controllers (static regret).  ...  It would also be interesting to consider systems with nonlinear dynamics and measure the performance of a receding-horizon control policy which iteratively linearizes the system dynamics and selects the  ... 
arXiv:2010.10473v2 fatcat:2zd24laxpveq5ninzvrnjkqpje