24 hits

Safe Option-Critic: Learning Safety in the Option-Critic Architecture [article]

Arushi Jain, Khimya Khetarpal, Doina Precup
2021 arXiv   pre-print
We propose an optimization objective that learns safe options by encouraging the agent to visit states with higher behavioural consistency.  ...  We consider a behaviour safe if it avoids regions of the state-space with high uncertainty in the outcomes of actions.  ...  , Martin Klissarov, Kushal Arora, for constructive discussions throughout the duration of this work, and the anonymous reviewers for the feedback on earlier drafts of this manuscript.  ... 
arXiv:1807.08060v2 fatcat:vdyfvbdlbbfxnnd3c6rktdxwsq

Towards Safe, Explainable, and Regulated Autonomous Driving [article]

Shahin Atakishiyev, Mohammad Salameh, Hengshuai Yao, Randy Goebel
2022 arXiv   pre-print
(AI), especially in the applications of deep learning and reinforcement learning.  ...  We propose a framework that integrates autonomous control, explainable AI, and regulatory compliance to address this issue and validate the framework with a critical analysis in a case study.  ...  ACKNOWLEDGMENTS We acknowledge support from the Alberta Machine Intelligence Institute (Amii), from the Computing Science Department of the University of Alberta, and the Natural Sciences and Engineering  ... 
arXiv:2111.10518v3 fatcat:topadg7bp5enhflk7yqm6j27ga

Explainable artificial intelligence for autonomous driving: An overview and guide for future research directions [article]

Shahin Atakishiyev, Mohammad Salameh, Hengshuai Yao, Randy Goebel
2022 arXiv   pre-print
Hence, aside from making safe real-time decisions, the AI systems of autonomous vehicles also need to explain how their decisions are constructed in order to comply with regulations across many jurisdictions  ...  However, intelligent decision-making in autonomous cars is not generally understandable by humans in the current state of the art, and such a deficiency hinders this technology from being socially acceptable  ...  ACKNOWLEDGMENT We acknowledge support from the Alberta Machine Intelligence Institute (Amii), from the Computing Science Department of the University of Alberta, and the Natural Sciences and Engineering  ... 
arXiv:2112.11561v2 fatcat:zluqlvmtznh25eihtouubib3ba

Hierarchical Reinforcement Learning: A Survey and Open Research Challenges

Matthias Hutsebaut-Buysse, Kevin Mets, Steven Latré
2022 Machine Learning and Knowledge Extraction  
We then introduce the Options framework, which provides a more generic approach, allowing abstractions to be discovered and learned semi-automatically.  ...  In order to further advance the development of HRL agents, capable of simultaneously learning abstractions and how to use them, solely from interaction with complex high dimensional environments, we also  ...  [212] introduced the Proximal Policy Option-Critic (PPOC) architecture.  ... 
doi:10.3390/make4010009 fatcat:emexhacqtvgdbelvbufusneira

Compositional Transfer in Hierarchical Reinforcement Learning [article]

Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller
2020 arXiv   pre-print
The presented algorithm enables stable and fast learning for complex, real-world domains in the parallel multitask and sequential transfer case.  ...  The successful application of general reinforcement learning algorithms to real-world robotics applications is often limited by their high data requirements.  ...  infrastructure side as well as many others of the DeepMind team.  ... 
arXiv:1906.11228v3 fatcat:ipw6uxy4lzbbvizfhmuggj553m

Reset-Free Lifelong Learning with Skill-Space Planning [article]

Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
2021 arXiv   pre-print
We learn the skills in an unsupervised manner using intrinsic rewards and plan over the learned skills using a learned dynamics model.  ...  The objective of lifelong reinforcement learning (RL) is to optimize agents which can continuously adapt and interact in changing environments.  ...  Variational option discovery  ... 
arXiv:2012.03548v3 fatcat:ri3ayz6kyvfonhc4vqrripg7si

Deep Reinforcement Learning [article]

Yuxi Li
2018 arXiv   pre-print
We discuss deep reinforcement learning in an overview style. We draw a big picture, filled with details.  ...  We discuss six core elements, six important mechanisms, and twelve applications, focusing on contemporary work, and in historical contexts.  ...  Lanctot et al. (2017) observe that independent RL, in which each agent learns by interacting with the environment, oblivious to other agents, can overfit the learned policies to other agents' policies  ... 
arXiv:1810.06339v1 fatcat:kp7atz5pdbeqta352e6b3nmuhy

A Survey of Deep Reinforcement Learning in Video Games [article]

Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, Dongbin Zhao
2019 arXiv   pre-print
This learning mechanism updates the policy to maximize the return with an end-to-end method.  ...  In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties.  ...  ACKNOWLEDGMENT The authors would like to thank Qichao Zhang, Dong Li and Weifan Li for the helpful comments and discussions about this work.  ... 
arXiv:1912.10944v2 fatcat:fsuzp2sjrfcgfkyclrsyzflax4

The Multi-Dimensional Actions Control Approach for Obstacle Avoidance Based on Reinforcement Learning

Menghao Wu, Yanbin Gao, Pengfei Wang, Fan Zhang, Zhejun Liu
2021 Symmetry  
This paper proposes the multi-dimensional action control (MDAC) approach based on a reinforcement learning technique, which can be used in multiple continuous action space tasks.  ...  Training the control policy with a reinforcement learning method is a trend.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/sym13081335 fatcat:gvq2pt6ayfdarbegmgzh7eyvzy

A public-key based secure mobile IP

John Zao, Stephen Kent, Joshua Gahm, Gregory Troxel, Matthew Condell, Pam Helinek, Nina Yuan, Isidro Castineyra
1997 Proceedings of the 3rd annual ACM/IEEE international conference on Mobile computing and networking - MobiCom '97  
In this paper, we present the design and the first implementation of a public key management system that can be used with IETF Mobile IP.  ...  The system, called the Mobile IP Security (MoIPS) system, was built upon a DNS based X.509 Public Key Infrastructure with innovation in certificate and CRL dispatch as well as light-weight key generation  ...  CertPolicy [optional, critical] allows the inclusion of policy specific information in the certificates in order to enforce Mobile IP access control policies. 3.
doi:10.1145/262116.262145 dblp:conf/mobicom/ZaoKGTCHYC97 fatcat:xbxljimgg5bb3avwnvjeoghgmm

Model-based Reinforcement Learning: A Survey [article]

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
2022 arXiv   pre-print
, and how to integrate planning in the learning and acting loop.  ...  Along the way, the survey also draws connections to several related RL fields, like hierarchical RL and transfer learning.  ...  Examples are the Option-Critic (Bacon et al., 2017; Riemer et al., 2018) and Feudal Networks (Vezhnevets et al., 2017) , where the E-step infers which options are active, and the M-step maximizes with  ... 
arXiv:2006.16712v4 fatcat:qyb4auoqovdeji4ov65sv6f3fq

Deep reinforcement learning for high-level behavior decision making [article]

Florian Dittrich, Universität Stuttgart
The results support the usage of multiple levels of non-linearities, as a linear variant of the DQN is not capable of learning effective policies in our experiments.  ...  Additionally, the necessity of multiple layers of non-linearities in the DQN algorithm is empirically evaluated using our scenarios.  ...  Initially, the aim was to learn both a high-level policy, as well as low-level policies using the Option-Critic Architecture [BHP17] .  ... 
doi:10.18419/opus-11978 fatcat:knewrvohtzewvcv25ruvdnhdvu

A contribution to architectural/engineered design for timber structures using knowledge-based methods

Robert John Taylor
This thesis attempts to synthesize knowledge from the fields of architecture, engineering, and computer science in the context of design.  ...  In particular, a novel approach to modeling the architectural and engineering design of structural connections is presented.  ...  Performance can be measured on the basis of simple demerit rating; the higher the rating from Option criticism, the poorer the Option is in overall performance.  ... 
doi:10.14288/1.0050389 fatcat:fh5gyu3ur5bllh7cbvmgbb57rm

The politics of downtown development: dynamic political cultures in San Francisco and Washington, D.C

1999 ChoiceReviews  
As time passed, citizens were more likely to evaluate their interests and options critically and take a more assertive role in making decisions about the future of their community.  ...  Architecture critics praised the plan for promoting innovative designs to make downtown a more hospitable environment and for preserving historically significant buildings.  ...  Franklin Square was rejuvenated with large, new office buildings and lunchtime concerts in the park. The Washington Convention Center opened in 1982 at 9th Street and New York Avenue.  ... 
doi:10.5860/choice.36-3613 fatcat:f4yji6gxszhznnh7gtknemlmoa


2016 Diabetes  
In conclusion, the cell transplantation using NCs derived from dermal spheroids was safe and effective on DPN, regardless of glucose metabolic conditions of donors.  ...  Specifically, the genes IL-2, IL-13, IL-17a, and Csf2 were enriched in inflammation-associated pathways at 8 weeks in DRG and at 24 weeks in SCN.  ...  Leptin acts via leptin receptor (LepRb)-expressing neurons in the brain to connect physiology and behavior control to the repletion of fat stores.  ... 
doi:10.2337/db16-1-381 fatcat:vja5hvr5izbylihops7vrpma7q
Showing results 1–15 of 24