A Unified q-Memorization Framework for Asynchronous Stochastic Optimization
2020
Journal of Machine Learning Research
Specifically, based on the q-memorization framework, 1) we propose an asynchronous stochastic gradient hard thresholding algorithm with q-memorization (AsySGHT-qM) for the non-convex optimization with ...
In this paper, we bridge this gap by using a unified q-memorization framework for various variance reduction techniques (including SVRG, S2GD, SAGA, q-SAGA) to analyze asynchronous stochastic algorithms ...
based on the unified q-memorization framework. ...
dblp:journals/jmlr/GuXHDH20
fatcat:6mqu7l6jz5gtjkmrux5qihhoxu
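The snippet above pairs a variance-reduced gradient estimator built from a q-memorization table with hard thresholding for sparsity-constrained, non-convex problems. The following is a minimal serial sketch of that combination on a least-squares objective; the refresh rule, step size, and loop structure are illustrative assumptions, not the paper's asynchronous AsySGHT-qM.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def sght_q_memorization(A, b, k, q=0.05, lr=0.01, n_iters=2000, seed=0):
    """Serial sketch: stochastic gradient hard thresholding where the gradient
    estimate is variance-reduced with a per-sample memory table.  Each slot is
    refreshed with probability q per iteration; different refresh rules recover
    SAGA-, q-SAGA-, or SVRG-like behavior (this coin flip is purely illustrative)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    mem = np.zeros((n, d))                    # stored per-sample gradients
    mem_avg = mem.mean(axis=0)
    for _ in range(n_iters):
        i = rng.integers(n)
        g_i = (A[i] @ x - b[i]) * A[i]        # gradient of 0.5*(a_i.x - b_i)^2
        v = g_i - mem[i] + mem_avg            # variance-reduced estimate
        x = hard_threshold(x - lr * v, k)     # descent step + sparsity projection
        refresh = rng.random(n) < q           # q-memorization refresh
        if refresh.any():
            new = (A[refresh] @ x - b[refresh])[:, None] * A[refresh]
            mem_avg += (new - mem[refresh]).sum(axis=0) / n
            mem[refresh] = new
    return x
```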
Using Deep Reinforcement Learning for the Continuous Control of Robotic Arms
[article]
2018
arXiv
pre-print
The concluding evaluation proves the general applicability of the described concepts by testing them using a simulated environment. These concepts might be reused for future experiments. ...
We test a newly created combination of two commonly used reinforcement learning methods, whether it is able to learn more effectively than a baseline. ...
Both outputs are unified to form an estimate of the Q-value which is then used for training. ...
arXiv:1810.06746v1
fatcat:pz5xo5ezdnf57pfjlxzvjcifxa
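The entry above mentions two network outputs being unified into a single Q-value estimate. The paper's exact combination is not given in the snippet; one common realization of that pattern is the dueling decomposition, shown here purely as an illustration.

```python
import numpy as np

def combine_to_q(value, advantages):
    """Dueling-style combination of two heads into Q-value estimates:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).  Illustrative only; the
    paper's actual unification of its two outputs may differ."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

v = np.array([[1.5]])             # state-value head output for one state
a = np.array([[0.2, -0.1, 0.4]])  # advantage head output for three actions
q = combine_to_q(v, a)            # Q estimates then used for training
```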
2021 Index IEEE Transactions on Neural Networks and Learning Systems Vol. 32
2021
IEEE Transactions on Neural Networks and Learning Systems
The Author Index contains the primary entry for each item, listed under the first author's name. ...
., +, TNNLS Nov. 2021, 5022-5033: A Unified Framework for Multilingual Speech Recognition in Air Traffic Control Systems. ...
., +, TNNLS July 2021, 3156-3167. Decoding: A Unified Framework for Multilingual Speech Recognition in Air Traffic Control Systems. ...
doi:10.1109/tnnls.2021.3134132
fatcat:2e7comcq2fhrziselptjubwjme
Learning RoboCup-Keepaway with Kernels
2007
Journal of Machine Learning Research
We employ the general framework of approximate policy iteration with least-squares-based policy evaluation. ...
Key challenges in keepaway are the high-dimensionality of the state space (rendering conventional discretization-based function approximation like tile coding infeasible), the stochasticity due to noise ...
Acknowledgments The authors wish to thank the anonymous reviewers for their useful comments and suggestions. ...
dblp:journals/jmlr/JungP07
fatcat:vqrrlh2vxnbblbamesobl4uxmu
Learning RoboCup-Keepaway with Kernels
[article]
2012
arXiv
pre-print
We employ the general framework of approximate policy iteration with least-squares-based policy evaluation. ...
Key challenges in keepaway are the high-dimensionality of the state space (rendering conventional discretization-based function approximation like tile coding infeasible), the stochasticity due to noise ...
Acknowledgments The authors wish to thank the anonymous reviewers for their useful comments and suggestions. ...
arXiv:1201.6626v1
fatcat:xsbsokwdyncink3unpvenjowoi
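Both RoboCup-Keepaway listings above rest on approximate policy iteration with least-squares-based policy evaluation. A minimal LSTD(0) evaluation step on a fixed batch of transitions might look like the sketch below; the random feature map stands in for the paper's kernel-based representation, and the regularizer is an added assumption.

```python
import numpy as np

def lstd_weights(phi, phi_next, rewards, gamma=0.99, reg=1e-6):
    """LSTD(0): solve A w = b with A = Phi^T (Phi - gamma Phi') + reg*I and
    b = Phi^T r, giving a linear value estimate V(s) ~ phi(s)^T w."""
    d = phi.shape[1]
    A = phi.T @ (phi - gamma * phi_next) + reg * np.eye(d)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)

# toy usage with random features standing in for kernel features
rng = np.random.default_rng(0)
phi, phi_next = rng.normal(size=(200, 10)), rng.normal(size=(200, 10))
rewards = rng.normal(size=200)
w = lstd_weights(phi, phi_next, rewards)
```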
On Monte Carlo Tree Search and Reinforcement Learning
2017
The Journal of Artificial Intelligence Research
Our study promotes a unified view of learning, planning, and search. ...
We show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. ...
We would like to thank also Michael Fairbank and Mark Nelson for providing very useful last-minute feedback. ...
doi:10.1613/jair.5507
fatcat:igffnyo5hfbyzigzxpp6t6pebi
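The entry above treats standard MCTS as one point in a family of algorithms obtained by varying the learning rule inside tree search. The piece that stays fixed in plain UCT is the UCB1 selection rule applied at each tree node, sketched below (the node bookkeeping is an illustrative assumption).

```python
import math

def uct_select(children, c=1.414):
    """UCB1 selection used by plain UCT: pick the child that maximizes the
    average return plus an exploration bonus based on visit counts.
    children: list of dicts with 'visits' (int) and 'total_return' (float)."""
    parent_visits = sum(ch["visits"] for ch in children)

    def score(ch):
        if ch["visits"] == 0:
            return float("inf")                   # expand unvisited children first
        mean = ch["total_return"] / ch["visits"]  # exploitation term
        bonus = c * math.sqrt(math.log(parent_visits) / ch["visits"])
        return mean + bonus

    return max(children, key=score)
```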
Applications of Deep Reinforcement Learning in Communications and Networking: A Survey
[article]
2018
arXiv
pre-print
Furthermore, we present applications of deep reinforcement learning for traffic routing, resource sharing, and data collection. ...
This paper presents a comprehensive literature review on applications of deep reinforcement learning in communications and networking. ...
unified framework. ...
arXiv:1810.07862v1
fatcat:qc3mqk2norazvc2xnynau6bqzu
A Survey of Deep Reinforcement Learning in Video Games
[article]
2019
arXiv
pre-print
A large number of video game AIs with DRL have achieved super-human performance, while there are still some challenges in this domain. ...
We also take a review of the achievements of DRL in various video games, including classical Arcade games, first-person perspective games and multi-agent real-time strategy games, from 2D to 3D, and from ...
ACKNOWLEDGMENT The authors would like to thank Qichao Zhang, Dong Li and Weifan Li for the helpful comments and discussions about this work. ...
arXiv:1912.10944v2
fatcat:fsuzp2sjrfcgfkyclrsyzflax4
Applications of Deep Reinforcement Learning in Communications and Networking: A Survey
2019
IEEE Communications Surveys and Tutorials
Furthermore, we present applications of deep reinforcement learning for traffic routing, resource sharing, and data collection. ...
This paper presents a comprehensive literature review on applications of deep reinforcement learning in communications and networking. ...
unified framework. ...
doi:10.1109/comst.2019.2916583
fatcat:5owsswhhrbctnirdtxre6mhv24
Zero-touch Continuous Network Slicing Control via Scalable Actor-Critic Learning
[article]
2021
arXiv
pre-print
knowledge learned in the past to solve future problems and re-configure computing resources autonomously while minimizing latency, energy consumption, and virtual network function (VNF) instantiation cost for ...
The paper defines and corroborates via extensive experimental results a zero-touch network slicing scheme with a multi-objective approach where the central server learns continuously to accumulate the ...
Furthermore, we use actor-critic architecture and pursue a joint policy and state-action return distribution optimization where we propose a prioritized asynchronous actor-learner optimized for the network ...
arXiv:2101.06654v1
fatcat:ijtxmfamifdvno2vjkjcqb2m5e
Reducing Noise in GAN Training with Variance Reduced Extragradient
[article]
2020
arXiv
pre-print
We address this issue with a novel stochastic variance-reduced extragradient (SVRE) optimization algorithm, which for a large class of games improves upon the previous convergence rates proposed in the ...
We study the effect of the stochastic gradient noise on the training of generative adversarial networks (GANs) and show that it can prevent the convergence of standard game optimization methods, while ...
Grant RGPIN-2017-06936, by the Hasler Foundation through the MEMUDE project, and by a Google ...
arXiv:1904.08598v3
fatcat:gotf432ufbfg3dhsdbu6lwabvi
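The SVRE entry above combines SVRG-style variance reduction with the extragradient step used for saddle-point problems. The sketch below applies that combination to a toy finite-sum bilinear game min_x max_y (1/n) sum_i x^T A_i y; it is a serial illustration under those assumptions, not the paper's full GAN training algorithm.

```python
import numpy as np

def svre_bilinear(A, lr=0.1, epochs=20, seed=0):
    """Serial sketch of stochastic variance-reduced extragradient on the
    finite-sum bilinear game min_x max_y (1/n) * sum_i x^T A_i y,
    where A has shape (n, d, d)."""
    rng = np.random.default_rng(seed)
    n, d, _ = A.shape
    x, y = np.ones(d), np.ones(d)

    def vr_grads(xc, yc, i, xs, ys, mu_x, mu_y):
        # SVRG-style estimate: stochastic gradient minus its value at the
        # snapshot point, plus the full snapshot gradient.
        gx = A[i] @ yc - A[i] @ ys + mu_x        # d/dx of x^T A_i y
        gy = A[i].T @ xc - A[i].T @ xs + mu_y    # d/dy of x^T A_i y
        return gx, gy

    for _ in range(epochs):
        xs, ys = x.copy(), y.copy()              # snapshot iterates
        A_bar = A.mean(axis=0)
        mu_x, mu_y = A_bar @ ys, A_bar.T @ xs    # full gradients at snapshot
        for _ in range(n):
            i = rng.integers(n)
            gx, gy = vr_grads(x, y, i, xs, ys, mu_x, mu_y)
            x_half, y_half = x - lr * gx, y + lr * gy          # extrapolation
            j = rng.integers(n)
            gx2, gy2 = vr_grads(x_half, y_half, j, xs, ys, mu_x, mu_y)
            x, y = x - lr * gx2, y + lr * gy2                  # update step
    return x, y
```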
Policy Message Passing: A New Algorithm for Probabilistic Graph Inference
[article]
2019
arXiv
pre-print
In this paper, we present the Policy Message Passing algorithm, which takes a probabilistic perspective and reformulates the whole information aggregation as stochastic sequential processes. ...
A general graph-structured neural network architecture operates on graphs through two core components: (1) complex enough message functions; (2) a fixed information aggregation process. ...
We adopt the standard variational inference method to derive the following identity: $\log p(y \mid s) - \mathrm{KL}\big(q(\tau \mid y, s)\,\|\,\pi^{*}(\tau \mid y, s)\big) = \mathbb{E}_{\tau \sim q}\big[\log p(y \mid \tau, s)\big] - \mathrm{KL}\big(q(\tau \mid s, y)\,\|\,\pi(\tau \mid s)\big)$ (9). We can optimize the ...
arXiv:1909.13196v1
fatcat:ah2urnhnvjhjleibyik6iuqvuy
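Equation (9) quoted in the entry above is the standard evidence decomposition from variational inference, written for the trajectory posterior. Under that reading (using Bayes' rule for the posterior $\pi^{*}$), the identity follows as sketched below; the steps are the textbook ones, not quoted from the paper.

```latex
\begin{align*}
\log p(y \mid s)
  &= \mathbb{E}_{\tau \sim q(\tau \mid y, s)}\!\left[
       \log \frac{p(y \mid \tau, s)\,\pi(\tau \mid s)}{\pi^{*}(\tau \mid y, s)}
     \right]
     && \text{since } \pi^{*}(\tau \mid y, s)
        = \frac{p(y \mid \tau, s)\,\pi(\tau \mid s)}{p(y \mid s)} \\
  &= \mathbb{E}_{\tau \sim q}\!\left[\log p(y \mid \tau, s)\right]
     - \mathrm{KL}\!\left(q(\tau \mid y, s) \,\|\, \pi(\tau \mid s)\right)
     + \mathrm{KL}\!\left(q(\tau \mid y, s) \,\|\, \pi^{*}(\tau \mid y, s)\right).
\end{align*}
```

Moving the last KL term to the left-hand side gives exactly the identity in (9).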
Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning
[article]
2021
arXiv
pre-print
This article provides a concise but holistic review of the recent advances made in using machine learning to achieve safe decision making under uncertainties, with a focus on unifying the language and ...
frameworks used in control theory and reinforcement learning research. ...
In the proposed approach, the last layer of the DNN is updated at a higher frequency for fast adaptation, while the inner layers are updated at a lower frequency to "memorize" pertinent features for the ...
arXiv:2108.06266v2
fatcat:gbbe3qyatfgelgzhqzglecr5qm
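The last snippet of the entry above describes updating the network's output layer at a higher frequency than the inner layers. A minimal PyTorch sketch of that split-frequency scheme is shown below; the architecture, learning rates, and update schedule are illustrative assumptions, not the paper's adaptive-control setup.

```python
import torch
import torch.nn as nn

# Two-frequency update: the output head is stepped every iteration, the
# backbone only every `slow_every` iterations (hyperparameters are illustrative).
backbone = nn.Sequential(nn.Linear(8, 32), nn.ReLU())
head = nn.Linear(32, 1)

opt_fast = torch.optim.SGD(head.parameters(), lr=1e-2)
opt_slow = torch.optim.SGD(backbone.parameters(), lr=1e-3)
slow_every = 10

for step in range(100):
    x = torch.randn(16, 8)
    target = torch.randn(16, 1)
    loss = nn.functional.mse_loss(head(backbone(x)), target)

    opt_fast.zero_grad()
    opt_slow.zero_grad()
    loss.backward()

    opt_fast.step()                      # fast adaptation of the last layer
    if step % slow_every == 0:
        opt_slow.step()                  # slow updates preserve learned features
```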
RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal Control
[article]
2020
arXiv
pre-print
We present a unified model-based and data-driven approach for quadrupedal planning and control to achieve dynamic locomotion over uneven terrain. ...
We train and evaluate our framework on a complex quadrupedal system, ANYmal version B, and demonstrate transferability to a larger and heavier robot, ANYmal C, without requiring retraining. ...
In this work, we propose a unified RL and optimal control (OC) based terrain-aware legged locomotion framework as illustrated in Fig. 3 . ...
arXiv:2012.03094v1
fatcat:ynp2g3ng6rbe3jsmqopz6rrwu4
Robust Temporal Ensembling for Learning with Noisy Labels
[article]
2021
arXiv
pre-print
Finally, we show that RTE also retains competitive corruption robustness to unforeseen input noise using CIFAR-10-C, obtaining a mean corruption error (mCE) of 13.50% even in the presence of an 80% noise ...
simple asynchronous optimisation algorithm that jointly optimizes a population of models and their hyperparameters. ...
In this way, the ECR term can loosely be seen as generating a set of stochastic differential constraints at each optimization step of the classification task loss. ...
arXiv:2109.14563v1
fatcat:7xyyzhv3abgcrn6nj7xpiehdbq
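The mCE figure quoted in the entry above follows the usual CIFAR-10-C protocol: per-corruption error rates are aggregated over severity levels, optionally normalized by a baseline model's errors, and then averaged over corruptions. A small helper in that spirit (the data layout and the optional baseline normalization are assumptions):

```python
import numpy as np

def mean_corruption_error(model_err, baseline_err=None):
    """Compute mCE from per-corruption, per-severity error rates.

    model_err    : dict mapping corruption name -> list of error rates,
                   one per severity level, e.g. {"gaussian_noise": [0.1, ...]}.
    baseline_err : same structure for a reference model; if given, each
                   corruption's errors are normalized by the baseline's
                   (the ImageNet-C convention), otherwise a plain average
                   over severities is used (common for CIFAR-10-C).
    """
    per_corruption = []
    for name, errs in model_err.items():
        errs = np.asarray(errs, dtype=float)
        if baseline_err is not None:
            base = np.asarray(baseline_err[name], dtype=float)
            per_corruption.append(errs.sum() / base.sum())
        else:
            per_corruption.append(errs.mean())
    return float(np.mean(per_corruption))
```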
Showing results 1 — 15 out of 235 results