
Deep Reinforcement Learning with Linear Quadratic Regulator Regions [article]

Gabriel I. Fernandez, Colin Togashi, Dennis W. Hong, Lin F. Yang
2020 arXiv   pre-print
The modified neural networks not only capture the nonlinearities of the system but also provably preserve linearity in a certain region of the state space and thus can be tuned to resemble a linear quadratic  ...  Practitioners often rely on compute-intensive domain randomization to ensure reinforcement learning policies trained in simulation can robustly transfer to the real world.  ...  Within this linear region, we fit the network with a pre-existing linear controller, e.g., a linear optimal controller: linear quadratic regulator (LQR), by either adding a regularizer or modifying the  ... 
arXiv:2002.09820v2 fatcat:77ec6enmjbev3lpn2t3b5czry4

Guest Editorial Introduction to the Special Issue of the IEEE L-CSS on Learning and Control

Giovanni Cherubini, Martin Guay, Sophie Tarbouriech, Kartik Ariyur, Mireille E. Broucke, Subhrakanti Dey, Christian Ebenbauer, Paolo Frasca, Bahman Gharesifard, Antoine Girard, Joao Manoel Gomes da Silva, Lars Grune (+5 others)
2020 IEEE Control Systems Letters  
Adaptive Control of Linear Quadratic Regulators: "Regret Lower Bounds for Unbiased Adaptive Control of Linear Quadratic Regulators": Ziemann and Sandberg present lower bounds for the regret of adaptive  ...  "Chance-Constrained Control With Lexicographic Deep Reinforcement Learning": Giuseppi and Pietrabissa introduce a lexicographic approach to deep reinforcement learning for chance-constrained control, where  ... 
doi:10.1109/lcsys.2020.2986590 fatcat:wx42r4h6ond3dkjntcwsdrmojy

Optimal control and learning for cyber‐physical systems

Yan Wan, Tao Yang, Ye Yuan, Frank L. Lewis
2021 International Journal of Robust and Nonlinear Control  
Addressing these challenges requires a seamless integration of the optimal control theory with advances from learning and other science and engineering domains.  ...  The papers received span broad topics including learning and data-driven optimal control to address physical unknowns and disturbances, estimation techniques to deal with uncertainties; secure and resilient  ...  The solution realizes linear quadratic regulation (LQR) control without any knowledge of the system parameters.  ... 
doi:10.1002/rnc.5442 fatcat:2sqn5j3urrgcrbjxfx6vsgvnci

Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search [article]

Jakob J. Hollenstein, Erwan Renaudo, Matteo Saveriano, Justus Piater
2020 arXiv   pre-print
Local policy search is performed by most Deep Reinforcement Learning (D-RL) methods, which increases the risk of getting trapped in a local minimum.  ...  We call the resulting model-based reinforcement learning method PPS (Planning for Policy Search).  ...  The steering approach based on a linear MPC with shrinking horizon is summarized in Algorithm 4.  ... 
arXiv:2010.12974v1 fatcat:7yxryhlhifdnfahjobiy35z7zi

Deep Deterministic Portfolio Optimization [article]

Ayman Chaouki, Stephen Hardiman, Christian Schmidt, Emmanuel Sérié, Joachim de Lataillade
2020 arXiv   pre-print
Can deep reinforcement learning algorithms be exploited as solvers for optimal trading strategies?  ...  We study the deep deterministic policy gradient algorithm and show that such a reinforcement learning agent can successfully recover the essential features of the optimal trading strategies and achieve  ...  Introduction: The fusion of reinforcement learning (RL) with deep learning techniques, a.k.a. deep RL (dRL), has experienced an astonishing increase in popularity over the last years [1].  ... 
arXiv:2003.06497v2 fatcat:7jdrosubonh4xjinmk5djnx6pe

Path Integral Networks: End-to-End Differentiable Optimal Control [article]

Masashi Okada, Luca Rigazio, Takenobu Aoshima
2017 arXiv   pre-print
and reinforcement learning.  ...  Preliminary experiment results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem.  ...  A commonly used optimal controller, the linear quadratic regulator (LQR), is also differentiable and Ref.  ... 
arXiv:1706.09597v1 fatcat:jbdgo2k2nbg45mel6x4fprpsky

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems [article]

Ekaterina Abramova, Luke Dickens, Daniel Kuhn, Aldo Faisal
2019 arXiv   pre-print
Our framework learns the local task dynamics from naive experience and forms locally optimal infinite horizon Linear Quadratic Regulators which produce continuous low-level control.  ...  We show that a small number of locally optimal linear controllers are able to solve global nonlinear control problems with unknown dynamics when combined with a reinforcement learner in this hierarchical  ...  Frameworks Similar to RLOC: A number of frameworks combining reinforcement learning with linear quadratic regulators (LQR) exist in the literature.  ... 
arXiv:1903.03064v1 fatcat:tdbj3pn4z5f5dkzfby3phq3p6u

Tailored neural networks for learning optimal value functions in MPC [article]

Dieter Teichrib, Moritz Schulze Darup
2021 arXiv   pre-print
In this paper, we provide a similar result for representing the optimal value function and the Q-function that are both known to be piecewise quadratic for linear MPC.  ...  Learning-based predictive control is a promising alternative to optimization-based MPC.  ...  The explicit way that it is either active or inactive on the individual regions linear quadratic regulator for constrained systems. Automatica, 38(1):3– R(i) .  ... 
arXiv:2112.03975v1 fatcat:wexhgbh2kvcjvitbqfwiadvbnq

DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [article]

Faraz Torabi, Garrett Warnell, Peter Stone
2021 arXiv   pre-print
Specifically, we consider time-varying linear Gaussian policies, and propose a method that integrates the linear-quadratic regulator with path integral policy improvement into an existing adversarial IfO  ...  , model-free reinforcement learning algorithms.  ...  The goal of reinforcement learning agents is to use their own experience to learn a policy that results in a behavior that incurs minimal expected cumulative cost. 1) The Linear-Quadratic Regulator (LQR  ... 
arXiv:2104.00163v1 fatcat:z4vjlyf7snbhznd2ro5z7wfyyu

Stability-certified reinforcement learning: A control-theoretic perspective [article]

Ming Jin, Javad Lavaei
2018 arXiv   pre-print
We investigate the important problem of certifying stability of reinforcement learning policies when interconnected with nonlinear dynamical systems.  ...  Empirical evaluations on two decentralized control tasks, namely multi-flight formation and power system frequency regulation, demonstrate that the reinforcement learning agents can have high performance  ...  Remarkable progress has been made in reinforcement learning (RL) using (deep) neural networks to solve complex decision-making and control problems [43].  ... 
arXiv:1810.11505v1 fatcat:j3zokw3spradpln4rfylufdcoy

Deep Q-learning: a robust control approach [article]

Balázs Varga, Balázs Kulcsár, Morteza Haghir Chehreghani
2022 arXiv   pre-print
In this paper, we place deep Q-learning into a control-oriented perspective and study its learning dynamics with well-established techniques from robust control.  ...  Setting up the learning agent with a control-oriented tuning methodology is more transparent and has well-established literature compared to the heuristics in reinforcement learning.  ...  learning (RL) (Q-learning) converges to an optimal linear quadratic (LQ) regulator if the environment is a linear system.  ... 
arXiv:2201.08610v1 fatcat:l3ub6jhwtjdz3o5hyns6ef57te

Large-Scale Graph Reinforcement Learning in Wireless Control Systems [article]

Vinicius Lima, Mark Eisen, Konstantinos Gatsis, Alejandro Ribeiro
2022 arXiv   pre-print
Designing resource allocation policies, however, is challenging, motivating recent works to successfully exploit deep learning and deep reinforcement learning techniques to design resource allocation and  ...  As the number of learnable parameters in a neural network grows with the size of the input signal, deep reinforcement learning may fail to scale, limiting the immediate generalization of such scheduling  ...  [34] in the field of deep reinforcement learning.  ... 
arXiv:2201.09859v2 fatcat:42mramfrird6dd3iupdqemjmyq

Neural Lyapunov Control [article]

Ya-Chien Chang, Nima Roohi, Sicun Gao
2020 arXiv   pre-print
We propose new methods for learning control policies and neural network Lyapunov functions for nonlinear control problems, with provable guarantee of stability.  ...  The approach significantly simplifies the process of Lyapunov control design, provides end-to-end correctness guarantee, and can obtain much larger regions of attraction than existing methods such as LQR  ...  The linear-quadratic regulator (LQR) is a widely adopted optimal control strategy.  ... 
arXiv:2005.00611v3 fatcat:sbbzcgo4ejff3lhg4u6mk7vrle

Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning [article]

Hejun Huang, Zhenglong Li, Dongkun Han
2022 arXiv   pre-print
Compared to quadratic programming based reinforcement learning methods, our sum-of-squares programming based reinforcement learning has shown its superiority.  ...  Reinforcement learning provides a useful way to strengthen safety. However, reinforcement learning algorithms cannot completely guarantee safety over realistic operations.  ...  CBF Guiding Control with Reinforcement Learning: In the first part of this section, we will further express the computation steps of (12).  ... 
arXiv:2206.07915v2 fatcat:lavhgfjst5fmrennlvbjfzjcsy

Deep Reinforcement Learning Based Volt-VAR Optimization in Smart Distribution Systems [article]

Ying Zhang, Xinan Wang, Jianhui Wang, Yingchen Zhang
2020 arXiv   pre-print
This paper develops a model-free volt-VAR optimization (VVO) algorithm via multi-agent deep reinforcement learning (MADRL) in unbalanced distribution systems.  ...  A delicately designed reward function guides these agents to interact with the distribution system, in the direction of reinforcing voltage regulation and power loss reduction simultaneously.  ...  For instance, [8] integrates the branch-and-bound approach to the trust-region sequential quadratic programming to iteratively solve the VVO problem.  ... 
arXiv:2003.03681v2 fatcat:kaysoexl3rg7fjmpwdjwb5ki3q