80,129 Hits in 6.5 sec

Towards Safe Reinforcement-Learning in Industrial Grid-Warehousing

Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo
<span title="">2020</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ozlq63ehnjeqxf6cuxxn27cqra" style="color: black;">Information Sciences</a> </i> &nbsp;
On the other hand, model-based reinforcement learning tries to encode environment transition dynamics into a predictive model.  ...  in non-deterministic and even deterministic, for fast-changing environments. (3) Conventional model-free exploration methods are not safe in mission-critical environments. (4) Reinforcement learning methods  ...  The Dreaming Variational Autoencoder (DVAE) is a model-based reinforcement learning approach for safe and efficient learning.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.ins.2020.06.010">doi:10.1016/j.ins.2020.06.010</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kzm4owdbzfhvvnwv653dvxm47u">fatcat:kzm4owdbzfhvvnwv653dvxm47u</a> </span>

Learning-Based Model Predictive Control for Safe Exploration

Torsten Koller, Felix Berkenkamp, Matteo Turchetta, Andreas Krause
<span title="">2018</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wjd7b2sxyfahnaei4xvi46vwsu" style="color: black;">2018 IEEE Conference on Decision and Control (CDC)</a> </i> &nbsp;
We combine a provably safe learning-based MPC scheme that allows for input-dependent uncertainties with techniques from model-based RL to solve tasks with only limited prior knowledge.  ...  In this paper, we attempt to bridge the gap between learning-based techniques that are scalable and highly autonomous but often unsafe and robust control techniques, which have a solid theoretical foundation  ...  Reinforcement learning objective and MPC scheme We require an objective function that jointly encourages exploration and finding a good control strategy based on our current statistical model.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cdc.2018.8619572">doi:10.1109/cdc.2018.8619572</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/cdc/KollerBT018.html">dblp:conf/cdc/KollerBT018</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/omofexgb6vbzrnuluspinjmnmu">fatcat:omofexgb6vbzrnuluspinjmnmu</a> </span>

LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Sparse Reward Iterative Tasks [article]

Albert Wilcox and Ashwin Balakrishna and Brijen Thananjeyan and Joseph E. Gonzalez and Ken Goldberg
<span title="2021-09-21">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Reinforcement learning (RL) has shown impressive success in exploring high-dimensional environments to learn complex tasks, but can often exhibit unsafe behaviors and require extensive environment interaction  ...  A promising strategy for learning in dynamically uncertain environments is requiring that the agent can robustly return to learned safe sets, where task success (and therefore safety) can be guaranteed  ...  These methods learn predictive models over either images or a learned latent space, which are then used by model predictive control (MPC) to optimize image-based task costs.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2107.04775v2">arXiv:2107.04775v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kmhxtsvfu5hojdrwff2qk6x7he">fatcat:kmhxtsvfu5hojdrwff2qk6x7he</a> </span>

Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention [article]

Bharat Prakash, Mohit Khatwani, Nicholas Waytowich, Tinoosh Mohsenin
<span title="2019-03-22">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We present a hybrid method for reducing the human intervention time by combining model-based approaches and training a supervised learner to improve sample efficiency while also ensuring safety.  ...  Recent progress in AI and Reinforcement learning has shown great success in solving complex problems with high dimensional state spaces.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1903.09328v1">arXiv:1903.09328v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/b2zsyziyprc7rgiukw4durauzu">fatcat:b2zsyziyprc7rgiukw4durauzu</a> </span>

Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning [article]

Lukas Brunke, Melissa Greeff, Adam W. Hall, Zhaocong Yuan, Siqi Zhou, Jacopo Panerati, Angela P. Schoellig
<span title="2021-12-06">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The last half-decade has seen a steep rise in the number of contributions on safe learning methods for real-world robotic deployments from both the control and reinforcement learning communities.  ...  Our review includes: learning-based control approaches that safely improve performance by learning the uncertain dynamics, reinforcement learning approaches that encourage safety or robustness, and methods  ...  support from the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canada Research Chairs Program, and the CIFAR AI Chair.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2108.06266v2">arXiv:2108.06266v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/gbbe3qyatfgelgzhqzglecr5qm">fatcat:gbbe3qyatfgelgzhqzglecr5qm</a> </span>

Reinforcement learning based algorithm with Safety Handling and Risk Perception

Suhas Shyamsundar, Tommaso Mannucci, Erik-Jan van Kampen
<span title="">2016</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/rcrfzcjoeva4hmwbfmhfolnrfu" style="color: black;">2016 IEEE Symposium Series on Computational Intelligence (SSCI)</a> </i> &nbsp;
This paper presents the setup and the results of a reinforcement learning problem utilizing Q-learning and a Safety Handling Exploration with Risk Perception Algorithm (SHERPA) for safe exploration in  ...  The agent has to explore its environment safely and must learn the optimal action for a given situation from the feedback received from the environment.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ssci.2016.7849367">doi:10.1109/ssci.2016.7849367</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/ssci/ShyamsundarMK16.html">dblp:conf/ssci/ShyamsundarMK16</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/4rewatc4hvchlggqlxggy4dq7e">fatcat:4rewatc4hvchlggqlxggy4dq7e</a> </span>

Exploration in deep reinforcement learning: A survey

Pawel Ladosz, Lilian Weng, Minwoo Kim, Hyondong Oh
<span title="">2022</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/u3qqmkiofjejrnpdxh3hdgssm4" style="color: black;">Information Fusion</a> </i> &nbsp;
In such a scenario, it is challenging for reinforcement learning to learn the association between rewards and actions. Thus, more sophisticated exploration methods need to be devised.  ...  methods, probabilistic methods, imitation-based methods, safe exploration and random-based methods.  ...  Exploration in Reinforcement Learning: Exploration in reinforcement learning can be split into two main streams: efficiency and safe exploration.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1016/j.inffus.2022.03.003">doi:10.1016/j.inffus.2022.03.003</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/q4uwqd26qjfyzivtyzqhf7u5cm">fatcat:q4uwqd26qjfyzivtyzqhf7u5cm</a> </span>

Uncertainty-Aware Reinforcement Learning for Collision Avoidance [article]

Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine
<span title="2017-02-03">2017</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Reinforcement learning can enable complex, adaptive behavior to be learned automatically for autonomous robotic platforms.  ...  Our predictive model is based on bootstrapped neural networks using dropout, allowing it to process raw sensory inputs from high-bandwidth sensors such as cameras.  ...  cost, and exploring this extension to general reinforcement learning problems could produce effective and safe exploration techniques for a wide range of robotic scenarios.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1702.01182v1">arXiv:1702.01182v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/buefsmv7obex3onfa4ntotksvm">fatcat:buefsmv7obex3onfa4ntotksvm</a> </span>

Risk Sensitive Model-Based Reinforcement Learning using Uncertainty Guided Planning [article]

Stefan Radic Webster, Peter Flach
<span title="2021-11-09">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Identifying uncertainty and taking mitigating actions is crucial for safe and trustworthy reinforcement learning agents, especially when deployed in high-risk environments.  ...  In this paper, risk sensitivity is promoted in a model-based reinforcement learning algorithm by exploiting the ability of a bootstrap ensemble of dynamics models to estimate environment epistemic uncertainty  ...  Acknowledgments and Disclosure of Funding We would like to thank Tom Bewley and Jonathan Thomas for their useful discussions and feedback while conceptualising this paper.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.04972v1">arXiv:2111.04972v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/4d2y4uqm5zcl7eelnddkrztx3m">fatcat:4d2y4uqm5zcl7eelnddkrztx3m</a> </span>

Safe Reinforcement Learning with Mixture Density Network: A Case Study in Autonomous Highway Driving [article]

Ali Baheri
<span title="2020-11-17">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
This paper presents a safe reinforcement learning system for automated driving that benefits from multimodal future trajectory predictions.  ...  We propose a safety system that consists of two safety components: a heuristic safety and a learning-based safety. The heuristic safety module is based on common driving rules.  ...  This model served as a model lookahead to accelerate the learning process and guide the exploration process.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2007.01698v3">arXiv:2007.01698v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/g6zxygzyenf7rjdlbr26x2fnbu">fatcat:g6zxygzyenf7rjdlbr26x2fnbu</a> </span>

Deep Reinforcement Learning with Enhanced Safety for Autonomous Highway Driving [article]

Ali Baheri, Subramanya Nageshrao, H. Eric Tseng, Ilya Kolmanovsky, Anouck Girard, Dimitar Filev
<span title="2020-04-23">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper, we present a safe deep reinforcement learning system for automated driving.  ...  The proposed framework leverages merits of both rule-based and learning-based approaches for safety assurance.  ...  The rule-based module, called handcrafted safety, is based on common driving practice, while the learning-based module serves as the model lookahead to predict safety longer into the future.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1910.12905v2">arXiv:1910.12905v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/okow4jbihfdyjokdqbule6qkvy">fatcat:okow4jbihfdyjokdqbule6qkvy</a> </span>

Parallel reward and punishment control in humans and robots: Safe reinforcement learning using the MaxPain algorithm

Stefan Elfwing, Ben Seymour
<span title="">2017</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/6uxdcjybibcyxgl23nvffdmliq" style="color: black;">2017 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob)</a> </i> &nbsp;
An important issue in reinforcement learning systems for autonomous agents is whether it makes sense to have separate systems for predicting rewards and punishments.  ...  In robotics, learning and control are typically achieved by a single controller, with punishments coded as negative rewards.  ...  BS is supported by the Wellcome Trust (UK) and the National Institute for Information and Communications Technology (Japan).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/devlrn.2017.8329799">doi:10.1109/devlrn.2017.8329799</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/icdl-epirob/ElfwingS17.html">dblp:conf/icdl-epirob/ElfwingS17</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/7zw6ermorzgvpj46jjsfhopy3e">fatcat:7zw6ermorzgvpj46jjsfhopy3e</a> </span>

safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning [article]

Zhaocong Yuan, Adam W. Hall, Siqi Zhou, Lukas Brunke, Melissa Greeff, Jacopo Panerati, Angela P. Schoellig
<span title="2022-02-25">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Here, we propose a new open-source benchmark suite, called safe-control-gym, supporting both model-based and data-based control techniques.  ...  In recent years, both reinforcement learning and learning-based control -- as well as the study of their safety, which is crucial for deployment in real-world robots -- have gained significant traction  ...  of safe learning-based control and reinforcement learning.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2109.06325v3">arXiv:2109.06325v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/q7dpbgtog5aqlgocuom2kdbwvq">fatcat:q7dpbgtog5aqlgocuom2kdbwvq</a> </span>

Safe Reinforcement Learning with Chance-constrained Model Predictive Control [article]

Samuel Pfrommer, Tanmay Gautam, Alec Zhou, Somayeh Sojoudi
<span title="2022-03-28">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We address the challenge of safe RL by coupling a safety guide based on model predictive control (MPC) with a modified policy gradient framework in a linear setting with continuous actions.  ...  We show theoretically that this penalty allows for a provably safe optimal base policy and illustrate our method with a simulated linearized quadrotor experiment.  ...  Model Predictive Control Model predictive control is a purely optimization-based planning framework.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2112.13941v2">arXiv:2112.13941v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/3rbp3kk7iveefkjpvtzs4thj7u">fatcat:3rbp3kk7iveefkjpvtzs4thj7u</a> </span>

Modeling Survival in model-based Reinforcement Learning [article]

Saeed Moazami, Peggy Doerschuk
<span title="2020-04-18">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this regard, model-based reinforcement learning proposes some remedies. Yet, inherently, model-based methods are more computationally expensive and susceptible to sub-optimality.  ...  To that end, a substitute model for the reward function approximator is introduced that learns to avoid terminal states rather than to maximize accumulated rewards from safe states.  ...  Most of the recent model-based reinforcement learning methods use artificial neural networks as predictive models for the environment's transition and reward function approximators.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2004.08648v1">arXiv:2004.08648v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/lftxigyzubejtptsw6eljcnlym">fatcat:lftxigyzubejtptsw6eljcnlym</a> </span>