
Verifiably Safe Off-Model Reinforcement Learning [chapter]

Nathan Fulton, André Platzer
2019 Lecture Notes in Computer Science  
This paper introduces verification-preserving model updates, the first approach toward obtaining formal safety guarantees for reinforcement learning in settings where multiple possible environmental models  ...  Acting well given an accurate environmental model is an important pre-requisite for safe learning, but is ultimately insufficient for systems that operate in complex heterogeneous environments.  ...  We call this problem verifiably safe off-model learning. In this paper we introduce a first approach toward obtaining formal safety proofs for off-model learning.  ... 
doi:10.1007/978-3-030-17462-0_28 fatcat:h7tbnexlfrbl5lsc223tnjqary

Verifiably Safe Off-Model Reinforcement Learning [article]

Nathan Fulton, André Platzer
2019 arXiv   pre-print
This paper introduces verification-preserving model updates, the first approach toward obtaining formal safety guarantees for reinforcement learning in settings where multiple environmental models must  ...  Acting well given an accurate environmental model is an important pre-requisite for safe learning, but is ultimately insufficient for systems that operate in complex heterogeneous environments.  ...  Our contributions enabling verifiably safe off-model learning include: 1.  ... 
arXiv:1902.05632v1 fatcat:b3celfznhfapfcr6r4zub6t75q

Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning

Nathan Fulton, André Platzer
2018 Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and the Thirtieth Innovative Applications of Artificial Intelligence Conference
by reinforcement learning.  ...  Verification results are preserved whenever learning agents limit exploration within the confounds of verified control choices as long as observed reality comports with the model used for off-line verification  ...  Justified Speculative Control combines verified runtime monitoring, backed by formally verified models, with reinforcement learning.  ... 
doi:10.1609/aaai.v32i1.12107 fatcat:ihbimfppzbakbna25vrmo6rcxu

Verified Probabilistic Policies for Deep Reinforcement Learning [article]

Edoardo Bacci, David Parker
2022 arXiv   pre-print
In this paper, we tackle the problem of verifying probabilistic policies for deep reinforcement learning, which are used to, for example, tackle adversarial environments, break symmetries and manage trade-offs  ...  There is also growing interest in formally verifying that such policies are correct and execute safely.  ...  We show that our approach successfully verifies probabilistic policies trained for several reinforcement learning benchmarks and explore trade-offs in precision and computational efficiency.  ... 
arXiv:2201.03698v2 fatcat:dix7xgasfrewbjrqdygzgf65fu

Runtime Safety Assurance Using Reinforcement Learning [article]

Christopher Lazarus, James G. Lopez, Mykel J. Kochenderfer
2020 arXiv   pre-print
We frame the design of RTSA with the Markov decision process (MDP) framework and use reinforcement learning (RL) to solve it.  ...  When the system is triggered, a verified recovery controller is deployed.  ...  From these episodes, we learn the parameters in an offline approach known as batch reinforcement learning.  ... 
arXiv:2010.10618v1 fatcat:ndx4lwhwtreylhmgbjrbh5mp2a

Reinforcement Learning with Probabilistic Guarantees for Autonomous Driving [article]

Maxime Bouton, Jesper Karlsson, Alireza Nakhaei, Kikuo Fujimura, Mykel J. Kochenderfer, Jana Tumova
2019 arXiv   pre-print
Reinforcement learning (RL) has been used to automatically derive suitable behavior in uncertain environments, but it does not provide any guarantee on the performance of the resulting policy.  ...  Reinforcement Learning: In reinforcement learning (RL), the environment is modeled as an MDP with unknown transition and reward models (Kaelbling et al., 1996).  ...  A reinforcement learning algorithm to derive provably safe policies has been demonstrated for hybrid systems in (Fulton et al., 2018).  ... 
arXiv:1904.07189v2 fatcat:fxsjjitdeven3a56jn6oxxohbm

Learning-Based Model Predictive Control for Safe Exploration

Torsten Koller, Felix Berkenkamp, Matteo Turchetta, Andreas Krause
2018 2018 IEEE Conference on Decision and Control (CDC)  
We combine a provably safe learning-based MPC scheme that allows for input-dependent uncertainties with techniques from model-based RL to solve tasks with only limited prior knowledge.  ...  Reinforcement learning has been successfully used to solve difficult tasks in complex unknown environments.  ...  Safe Reinforcement Learning: We design an MPC scheme that can solve a given RL task under safety constraints.  ... 
doi:10.1109/cdc.2018.8619572 dblp:conf/cdc/KollerBT018 fatcat:omofexgb6vbzrnuluspinjmnmu

Safe adaptation in multiagent competition [article]

Macheng Shen, Jonathan P. How
2022 arXiv   pre-print
To overcome this difficulty, we developed a safe adaptation approach in which the ego-agent is trained against a regularized opponent model, which effectively avoids overfitting and consequently improves  ...  In multiagent competitive scenarios, agents may have to adapt to new opponents with previously unseen behaviors by learning from the interaction experiences between the ego-agent and the opponent.  ...  reinforcement learning setting.  ... 
arXiv:2203.07562v1 fatcat:jc23hg567jerboxajlqcaz74pu

Verifiably Safe Exploration for End-to-End Reinforcement Learning [article]

Nathan Hunt, Nathan Fulton, Sara Magliacane, Nghia Hoang, Subhro Das, Armando Solar-Lezama
2020 arXiv   pre-print
Deploying deep reinforcement learning in safety-critical settings requires developing algorithms that obey hard constraints during exploration.  ...  Our benchmark draws from several proposed problem sets for safe learning and includes problems that emphasize challenges such as reward signals that are not aligned with safety constraints.  ...  Supplementary material for: Verifiably Safe Exploration for End-to-End Reinforcement Learning. A: Model Monitoring. We use differential Dynamic Logic (dL) (Platzer, 2008, 2010, 2012  ... 
arXiv:2007.01223v1 fatcat:shyzxdco5jhzlfwpefvf2ejnly

Do Androids Dream of Electric Fences? Safety-Aware Reinforcement Learning with Latent Shielding [article]

Peter He, Borja G. Leon, Francesco Belardinelli
2021 arXiv   pre-print
In recent years, a variety of approaches have been put forward to address the challenges of safety-aware reinforcement learning; however, these methods often either require a handcrafted model of the environment  ...  We present a novel approach to safety-aware deep reinforcement learning in high-dimensional environments called latent shielding.  ...  Symbolic Reinforcement Learning for Safe RAN  ... 
arXiv:2112.11490v1 fatcat:o5curea7ena5blafok3uldvfni

Safe Q-Learning Method Based on Constrained Markov Decision Processes

Yangyang Ge, Fei Zhu, Xinghong Ling, Quan Liu
2019 IEEE Access  
The experiments verify the effectiveness of the algorithm. Index terms: constrained Markov decision processes, safe reinforcement learning, Q-learning, constraint, Lagrange multiplier.  ...  In order to solve the aforementioned problem, we propose a safe Q-learning method based on constrained Markov decision processes, adding safety constraints as prerequisites to the model  ...  Safe Q-learning, however, can solve model-free problems, because Q-learning is an efficient model-free reinforcement learning algorithm.  ... 
doi:10.1109/access.2019.2952651 fatcat:wzezmhbadbabtjoy47vxpjgqdu

Safe Model-based Off-policy Reinforcement Learning for Eco-Driving in Connected and Automated Hybrid Electric Vehicles [article]

Zhaoxuan Zhu, Nicola Pivaro, Shobhit Gupta, Abhishek Gupta, Marcello Canova
2022 arXiv   pre-print
While the previous studies synthesize simulators and model-free DRL to reduce online computation, this work proposes a Safe Off-policy Model-Based Reinforcement Learning algorithm for the eco-driving problem  ...  First, the combination of off-policy learning and the use of a physics-based model improves the sample efficiency.  ...  The second contribution of this work, from the RL algorithm perspective, is the development of Safe Model-based Off-policy Reinforcement Learning (SMORL), a safety-critical model-based off-policy Q-learning  ... 
arXiv:2105.11640v2 fatcat:6nj7e5vej5b3rkfp4snjk7nt4e

Fuzzy controller, designed by reinforcement learning, for vehicle traction system application

L. I. Demkiv, A. O. Lozynskyy, V. V. Vantsevich, D. J. Gorsich, V. V. Lytvyn, S. R. Klos, M. D. Letherwood (Lviv Polytechnic National University; University of Alabama at Birmingham; US Army CCDC Ground Vehicle Systems Center, Warren, MI) (+2 others)
2021 Mathematical Modeling and Computing  
To verify the performance of the proposed controller, the adaptive fuzzy controller tuned by reinforcement learning is applied to the mathematical model of a wheel locomotion module of an electric vehicle  ...  Thus, the designed fuzzy control that is tuned by reinforcement learning is capable of ensuring the stable, optimal, and safe performance of the system and takes into account external disturbances.  ...  Background. Reinforcement learning: Q-learning algorithm. Reinforcement learning is a subfield of machine learning.  ... 
doi:10.23939/mmc2021.02.168 fatcat:j52xso7ejja5rktkmq3po7gpte

A Review of Safe Reinforcement Learning: Methods, Theory and Applications [article]

Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois Knoll
2022 arXiv   pre-print
//github.com/chauncygu/Safe-Reinforcement-Learning-Baselines.git.  ...  Reinforcement learning (RL) has achieved tremendous success in many complex decision making tasks.  ...  Model-Based Safe Reinforcement Learning: Linear program and Lagrangian approximation are widely used in model-based safe reinforcement learning if the estimated transition model is either given or estimated  ... 
arXiv:2205.10330v3 fatcat:2xcflmxcffeejdsrcumnvx6x2e

Near Optimal Control With Reachability and Safety Guarantees

Cees F. Verdier, Robert Babuška, Barys Shyrokau, Manuel Mazo
2019 IFAC-PapersOnLine  
We propose a method to construct a near-optimal control law by means of model-based reinforcement learning and subsequently verifying the reachability and safety of the closed-loop control system through  ... 
doi:10.1016/j.ifacol.2019.09.146 fatcat:465axkqg5bb6hhvxb7xmyp2ik4