Generative Adversarial Exploration for Reinforcement Learning [article]

Weijun Hong, Menghui Zhu, Minghuan Liu, Weinan Zhang, Ming Zhou, Yong Yu, Peng Sun
2022 arXiv   pre-print
Exploration is crucial for training the optimal reinforcement learning (RL) policy, where the key is to discriminate whether a visited state is novel.  ...  In this paper, we propose a novel method called generative adversarial exploration (GAEX) to encourage exploration in RL via introducing an intrinsic reward output from a generative adversarial network  ...  CONCLUSION This paper provides a novel exploration framework, Generative Adversarial Exploration (GAEX), to address the exploration-exploitation dilemma in deep reinforcement learning.  ... 
arXiv:2201.11685v1 fatcat:xseso2ciaffijczbc573vnigqi

Risk Averse Robust Adversarial Reinforcement Learning [article]

Xinlei Pan, Daniel Seita, Yang Gao, John Canny
2019 arXiv   pre-print
In this paper we introduce risk-averse robust adversarial reinforcement learning (RARARL), using a risk-averse protagonist and a risk-seeking adversary.  ...  A classical technique for improving the robustness of reinforcement learning algorithms is to train on a set of randomized environments, but this approach only guards against common situations.  ...  More generally, robustness and safety have long been explored in reinforcement learning [21], [22], [23]. Chow et al.  ... 
arXiv:1904.00511v1 fatcat:5jwf2nreefe7bjgpli7leupazi

DANCin SEQ2SEQ: Fooling Text Classifiers with Adversarial Text Example Generation [article]

Catherine Wong
2017 arXiv   pre-print
Despite significant recent work on adversarial example generation targeting image classifiers, relatively little work exists exploring adversarial example generation for text classifiers; additionally,  ...  We recast adversarial text example generation as a reinforcement learning problem, and demonstrate that our algorithm offers preliminary but promising steps towards generating semantically meaningful adversarial  ...  Acknowledgments Many thanks to Will Monroe for his crackerjack adversarial text generation advice and expertise, and for sharing an alarming series of articles about 3D-printed turtles misclassified as  ... 
arXiv:1712.05419v1 fatcat:ccrkfg4nargw3hctkdi6iispym

Deep Adversarial Reinforcement Learning for Object Disentangling [article]

Melvin Laux, Oleg Arenz, Jan Peters, Joni Pajarinen
2021 arXiv   pre-print
To solve this problem, we present a novel adversarial reinforcement learning (ARL) framework.  ...  Deep learning in combination with improved training techniques and high computational power has led to recent advances in the field of reinforcement learning (RL) and to successful robotic RL applications  ...  We present a novel adversarial learning framework for reinforcement learning algorithms: Adversarial reinforcement learning (ARL).  ... 
arXiv:2003.03779v2 fatcat:ekewmbpqi5bitiw2y34tkppnui

LexicalAT: Lexical-Based Adversarial Reinforcement Training for Robust Sentiment Classification

Jingjing Xu, Liang Zhao, Hanqi Yan, Qi Zeng, Yun Liang, Xu SUN
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
Due to the discrete generation step in the generator, we use policy gradient, a reinforcement learning approach, to train the two modules.  ...  The proposed approach consists of a generator and a classifier. The generator learns to generate examples to attack the classifier while the classifier learns to defend these attacks.  ...  Acknowledgments We thank all reviewers for providing the thoughtful and constructive suggestions. This work was supported in part by National Natural Science Foundation of China (No. 61673028).  ... 
doi:10.18653/v1/d19-1554 dblp:conf/emnlp/XuZYZLS19 fatcat:czux2pcmp5dwdczl4t7xbt2c5e

Adaptive perturbation adversarial training: based on reinforcement learning [article]

Zhishen Nie, Ying Lin, Sp Ren, Lan Zhang
2021 arXiv   pre-print
This paper proposes a method for finding marginal adversarial samples based on reinforcement learning, and combines it with the latest fast adversarial training technology, which effectively speeds up  ...  However, searching for marginal adversarial samples brings additional computational costs.  ...  Algorithm 1 describes the marginal adversarial sample search based on reinforcement learning.  ... 
arXiv:2108.13239v1 fatcat:t5z5jshnybd37f75txuwwkczxy

MedAttacker: Exploring Black-Box Adversarial Attacks on Risk Prediction Models in Healthcare [article]

Muchao Ye, Junyu Luo, Guanjie Zheng, Cao Xiao, Ting Wang, Fenglong Ma
2021 arXiv   pre-print
MedAttacker addresses the challenges brought by EHR data via two steps: hierarchical position selection which selects the attacked positions in a reinforcement learning (RL) framework and substitute selection  ...  In addition, based on the experiment results we include a discussion on defending EHR adversarial attacks.  ...  Thus, in this complex situation, MedAttacker is able to bring out the potential of reinforcement learning and generate more adversarial examples to successfully fool the victim models.  ... 
arXiv:2112.06063v1 fatcat:jiiues7sere6lfjs2x2d76vy5m

Adversarial recovery of agent rewards from latent spaces of the limit order book [article]

Jacobo Roa-Vicens, Yuanbo Wang, Virgile Mison, Yarin Gal, Ricardo Silva
2019 arXiv   pre-print
Inverse reinforcement learning has proved its ability to explain state-action trajectories of expert agents by recovering their underlying reward functions in increasingly challenging environments.  ...  In this paper, we explore whether adversarial inverse RL algorithms can be adapted and trained within such latent space simulations from real market data, while maintaining their ability to recover agent  ...  Adversarial learning from Reinforcement Learning trading experts As in general GAN structures, adversarial inverse reinforcement learning is implemented through a generator and a discriminator: two neural  ... 
arXiv:1912.04242v1 fatcat:zfbo3bhxc5akzjvdgisb2ib3f4

Solving reward-collecting problems with UAVs: a comparison of online optimization and Q-learning [article]

Yixuan Liu, Chrysafis Vogiatzis, Ruriko Yoshida, Erich Morman
2021 arXiv   pre-print
We present a comparison of three methods to solve this problem: namely we implement a Deep Q-Learning model, an ε-greedy tabular Q-Learning model, and an online optimization framework.  ...  As the prevalence of UAVs increases, there have also been improvements in counter-UAV technology that make it difficult for them to successfully obtain valuable intelligence within an area of interest.  ...  Erich Morman: Modeled and implemented the ε-greedy tabular Q-Learning. Additionally conducted computational experiments using -Q learning.  ... 
arXiv:2112.00141v1 fatcat:motshvd4qrfvphe2hnoppyib4q

Domain Adaptation for Reinforcement Learning on the Atari [article]

Thomas Carr, Maria Chli, George Vogiatzis
2018 arXiv   pre-print
This is borne out by the fact that a reinforcement learning agent has no prior knowledge of the world, no pre-existing data to depend on and so must devote considerable time to exploration.  ...  We demonstrate that this initialisation step provides significant improvement when learning a new reinforcement learning task, which highlights the wide applicability of adversarial adaptation methods;  ...  Conclusion We have presented an adversarial method for knowledge transfer in reinforcement learning.  ... 
arXiv:1812.07452v1 fatcat:mdhunianpjdvbn2u6nmqrxn7ta

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning [article]

Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Yun-Nung Chen, Kam-Fai Wong
2018 arXiv   pre-print
This paper presents a new method --- adversarial advantage actor-critic (Adversarial A2C), which significantly improves the efficiency of dialogue policy learning in task-completion dialogue systems.  ...  Inspired by generative adversarial networks (GAN), we train a discriminator to differentiate responses/actions generated by dialogue agents from responses/actions by experts.  ...  Ho and Ermon also drew a connection between inverse reinforcement learning and generative adversarial networks to learn the reward function in the GAN framework [23].  ... 
arXiv:1710.11277v2 fatcat:sydo2iblv5d6fmip5pxfpvwakq

A Reinforced Generation of Adversarial Examples for Neural Machine Translation [article]

Wei Zou, Shujian Huang, Jun Xie, Xinyu Dai, Jiajun Chen
2020 arXiv   pre-print
Instead of collecting and analyzing bad cases using limited handcrafted error features, here we investigate this issue by generating adversarial examples via a new paradigm based on reinforcement learning  ...  The results show that our method efficiently produces stable attacks with meaning-preserving adversarial examples.  ...  Acknowledgement We would like to thank the anonymous reviewers for their insightful comments. Shujian Huang is the corresponding author.  ... 
arXiv:1911.03677v2 fatcat:pdc76yp2trhgnffths76i4unce

Adversarial attack and defense in reinforcement learning-from AI security view

Tong Chen, Jiqiang Liu, Yingxiao Xiang, Wenjia Niu, Endong Tong, Zhen Han
2019 Cybersecurity  
Reinforcement learning is a core technology for modern artificial intelligence, and it has become a workhorse for AI applications ranging from Atari Game to Connected and Automated Vehicle System (CAV)  ...  Hence, in this paper, we give the very first attempt to conduct a comprehensive survey on adversarial attacks in reinforcement learning under AI security.  ...  applications in AI. (I) They gave the very first attempt to prove that reinforcement learning systems are vulnerable to adversarial attack, and the traditional generation algorithms designed for adversarial  ... 
doi:10.1186/s42400-019-0027-x fatcat:nlox7arfojaerietjz5ipskucm

Adversarial Deep Reinforcement Learning based Adaptive Moving Target Defense [article]

Taha Eghtesad, Yevgeniy Vorobeychik, Aron Laszka
2020 arXiv   pre-print
Based on an established model of adaptive MTD, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm to solve the game.  ...  Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions.  ...  In general, traditional reinforcement learning techniques use tabular approaches to store estimated rewards (e.g., Q-Learning) [11].  ... 
arXiv:1911.11972v2 fatcat:qszyzcpwvba3fmwrd3qjtyvefe

Adversarial Attacks on Neural Network Policies [article]

Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel
2017 arXiv   pre-print
In this work, we show adversarial attacks are also effective when targeting neural network policies in reinforcement learning.  ...  We characterize the degree of vulnerability across tasks and training algorithms, for a subclass of adversarial-example attacks in white-box and black-box settings.  ...  Transferability Across Policies To explore transferability of adversarial examples across policies, we generate adversarial perturbations for the target policy using one of the other top-performing policies  ... 
arXiv:1702.02284v1 fatcat:iszu636xnjedxe2ponhr7tuxiu