Meta-Learning Guarantees for Online Receding Horizon Learning Control [article]

Deepan Muthirayan, Pramod P. Khargonekar
2022 arXiv   pre-print
In this paper we provide provable regret guarantees for an online meta-learning receding horizon control algorithm in an iterative control setting.  ...  By analysing conditions under which sub-linear regret is achievable, we prove that the meta-learning online receding horizon controller achieves an average of the dynamic regret for the controller cost  ...  For this setting we propose and study an online model-based meta-learning Receding Horizon Control (RHC) algorithm.  ... 
arXiv:2010.11327v14 fatcat:asbudhtn2bex7bqkbetkb4z3ki

Cost Adaptation for Robust Decentralized Swarm Behaviour [article]

Peter Henderson, Matthew Vertescher, David Meger, Mark Coates
2018 arXiv   pre-print
Decentralized receding horizon control (D-RHC) provides a mechanism for coordination in multi-agent settings without a centralized command center.  ...  To allay this problem, we use a meta-learning process -- cost adaptation -- which generates the optimization objective for D-RHC to solve based on a set of human-generated priors (cost and constraint functions  ...  One method for coordination of a swarm is via decentralized receding horizon control (D-RHC) [4].  ... 
arXiv:1709.07114v2 fatcat:llin4qgkxjeinpk2ojern4pykq

Learning Mixed Strategies in Trajectory Games [article]

Lasse Peters, David Fridovich-Keil, Laura Ferranti, Cyrill Stachniss, Javier Alonso-Mora, Forrest Laine
2022 arXiv   pre-print
First, we introduce an offline training phase which reduces the online computational burden of solving trajectory games.  ...  In multi-agent settings, game theory is a natural framework for describing the strategic interactions of agents whose objectives depend upon one another's behavior.  ...  Finally, we showcase our approach in online learning, where each player solves lifted trajectory games in a receding time horizon.  ... 
arXiv:2205.00291v2 fatcat:wwchxkscjjfndaabwlba4qjsfe

Decentralized Role Assignment in Multi-Agent Teams via Empirical Game-Theoretic Analysis [article]

Fengjun Yang, Negar Mehr, Mac Schwager
2021 arXiv   pre-print
Based on this game-theoretic formulation, we propose a distributed controller for each robot to dynamically decide on the best role to take.  ...  We demonstrate our method in simulations of a collaborative planar manipulation scenario in which each agent chooses from a set of feedback control policies at each instant.  ...  This method of applying EGTA in a receding-horizon fashion is by no means limited to finding role assignments.  ... 
arXiv:2109.14755v1 fatcat:tvwpcjmahbd2pavfcwkr5b6fwa

Online Resource Provisioning for Wireless Data Collection

Yu Liu, Joshua Comden, Zhenhua Liu, Yuanyuan Yang
2022 ACM Transactions on Sensor Networks  
Specifically, we design separate online algorithms for systems in which the state evolves in either a stationary manner or an arbitrarily determined manner and prove their performance bounds where their  ...  Additionally, we design a meta-algorithm that can choose which online algorithm to implement at each point in time, depending on the recent behavior of the system environment.  ...  We also tested two online algorithms, Receding Horizon Control (RHC) and Fixed Horizon Control (FHC) [22], that use predictions but are not designed for online optimization problems with inventory constraints  ... 
doi:10.1145/3470648 fatcat:klo6x6tgpng55jjgss5zyash2y

Comparison of Deep Reinforcement Learning and Model Predictive Control for Adaptive Cruise Control [article]

Yuan Lin, John McPhee, Nasser L. Azad
2020 arXiv   pre-print
This study compares Deep Reinforcement Learning (DRL) and Model Predictive Control (MPC) for Adaptive Cruise Control (ACC) design in car-following scenarios.  ...  long prediction horizon.  ...  ACKNOWLEDGMENT The authors would like to thank Toyota, Ontario Centres of Excellence, and the Natural Sciences and Engineering Research Council of Canada for the support of this work.  ... 
arXiv:1910.12047v3 fatcat:qosqvikdjrgg7asfk5deivj6iu

Fault compensation by online updating of genetic algorithm-selected neural network model for model predictive control

Seong Hyeon Hong, Jackson Cornelius, Yi Wang, Kapil Pant
2019 SN Applied Sciences  
Through selective online updating of weight parameters, the online ANN is able to accurately capture the fault-induced variations in system dynamics, and can be used for MPC reconfiguration and fault compensation  ...  The dual-net model is comprised of an offline and an online artificial neural networks (ANNs) along with a switch that selects one of them for MPC.  ...  Eric Mark at the ARL for his support and feedback on the present work. Compliance with ethical standards  ... 
doi:10.1007/s42452-019-1526-9 fatcat:gffmmpuva5bcxjnayxsv2gdbfu

Adaptive Robust Model Predictive Control with Matched and Unmatched Uncertainty [article]

Rohan Sinha, James Harrison, Spencer M. Richards, Marco Pavone
2021 arXiv   pre-print
We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive  ...  in the presence of uncertainties of large magnitude, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety.  ...  Acknowledgements: We thank Monimoy Bujarbaruah for his thoughtful comments on an early manuscript.  ... 
arXiv:2104.08261v3 fatcat:xryhcaxbzjfsdim7ik7hcoam6a

A Regret Minimization Approach to Iterative Learning Control [article]

Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh
2021 arXiv   pre-print
We consider the setting of iterative learning control, or model-based policy learning in the presence of uncertain, time-varying dynamics.  ...  Based on recent advances in non-stochastic control, we design a new iterative algorithm for minimizing planning regret that is more robust to model mismatch and uncertainty.  ...  Meta-Learning. Our setting, analysis and, in particular, the nested OCO setup bears similarity to formulations for gradient-based meta-learning (see [FAL17] and references therein).  ... 
arXiv:2102.13478v1 fatcat:a6wtiktq7nf5takiyzpmtzozra

Analyzing the Improvements of Energy Management Systems for Hybrid Electric Vehicles Using a Systematic Literature Review: How Far Are These Controls from Rule-Based Controls Used in Commercial Vehicles?

Juan P. Torreglosa, Pablo Garcia-Triviño, David Vera, Diego A. López-García
2020 Applied Sciences  
This work presents a systematic literature review (SLR) of the more recent works that developed EMSs for HEVs.  ...  To take advantage of the emission reduction potential of hybrid electric vehicles (HEVs), the appropriate design of their energy management systems (EMSs) to control the power flow between the engine and  ...  Machine learning logic employed as online controller for multimode HEVs was proposed in [38] .  ... 
doi:10.3390/app10238744 fatcat:jacrh6igaffm3etle65xbs6pnm

Smoothed Online Convex Optimization in High Dimensions via Online Balanced Descent [article]

Niangjun Chen, Gautam Goel, Adam Wierman
2018 arXiv   pre-print
We study Smoothed Online Convex Optimization, a version of online convex optimization where the learner incurs a penalty for changing her actions between rounds.  ...  We demonstrate the generality of the OBD framework by showing how, with different choices of "balance," OBD can improve upon state-of-the-art performance guarantees for both competitive ratio and regret  ...  Currently, the only unifying frameworks for SOCO rely on the use of predictions, and use approaches based on receding horizon control, e.g., Chen et al. (2015); Badiei et al. (2015); Chen et al. (  ... 
arXiv:1803.10366v2 fatcat:plqeahnygfhcbfhi3yeq6j2yom

Combining Task and Motion Planning: Challenges and Guidelines

Masoumeh Mansouri, Federico Pecora, Peter Schüller
2021 Frontiers in Robotics and AI  
By doing so, this article aims to provide a guideline for designing combined TAMP solutions that are adequate and effective in the target scenario.  ...  All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.  ...  When planning for motions, this is known as Receding Horizon Control, or Model Predictive Control.  ... 
doi:10.3389/frobt.2021.637888 pmid:34095239 pmcid:PMC8170405 fatcat:4dmykcbhezbktfjzafv7t7wjfa

Reinforcement Learning: A Survey

L. P. Kaelbling, M. L. Littman, A. W. Moore
1996 The Journal of Artificial Intelligence Research  
It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.  ...  Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment.  ...  Acknowledgements Thanks to Marco Dorigo and three anonymous reviewers for comments that have helped to improve this paper.  ... 
doi:10.1613/jair.301 fatcat:nbo23vmu6rfz3ctpjbk7sdcnt4

Reinforcement Learning: A Survey [article]

L. P. Kaelbling, M. L. Littman, A. W. Moore
1996 arXiv   pre-print
It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.  ...  Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment.  ...  Acknowledgements Thanks to Marco Dorigo and three anonymous reviewers for comments that have helped to improve this paper.  ... 
arXiv:cs/9605103v1 fatcat:ze737h6wnfdhjf52hiz4gpogxq

Revisiting Smoothed Online Learning [article]

Lijun Zhang, Wei Jiang, Shiyin Lu, Tianbao Yang
2021 arXiv   pre-print
In this paper, we revisit the problem of smoothed online learning, in which the online learner suffers both a hitting cost and a switching cost, and target two performance metrics: competitive ratio and  ...  Furthermore, if the hitting cost is accessible in the beginning of each round, we obtain a similar guarantee without the bounded gradient condition, and establish an Ω(√(T(1+P_T))) lower bound to confirm  ...  Acknowledgments The authors would like to thank Yuxuan Xiang for discussions about Theorem 4.  ... 
arXiv:2102.06933v3 fatcat:pxxiyq4zurenzaxdfsbeigjn3e