The Curse of Passive Data Collection in Batch Reinforcement Learning [article]

Chenjun Xiao, Ilbin Lee, Bo Dai, Dale Schuurmans, Csaba Szepesvari
2022 arXiv   pre-print
While in simple cases, such as in bandits, passive and active data collection are similarly effective, the price of passive sampling can be much higher when collecting data from a system with controlled  ...  In high stake applications, active experimentation may be considered too risky and thus data are often collected passively.  ...  Csaba Szepesvári and Dale Schuurmans gratefully acknowledge funding from the Canada CIFAR AI Chairs Program, Amii and NSERC.  ... 
arXiv:2106.09973v2 fatcat:hhfxjvkwuvb2xctrznit6pzuju

A Brief Survey of Machine Learning Methods for Emotion Prediction using Physiological Data [article]

Maryam Khalid, Emily Willis
2022 arXiv   pre-print
Comparing regression, long short-term memory (LSTM) networks, convolutional neural networks (CNN), reinforcement online learning (ROL), and deep belief networks (DBN), we showcase the variability of machine  ...  Among other data sources, physiological data can serve as an indicator for emotions with an added advantage that it cannot be masked/tampered by the individual and can be easily collected.  ...  Reinforcement Online Learning The work presented in [11] combines online learning with reinforcement learning. Since these methods were not discussed in class, we first present a brief summary.  ... 
arXiv:2201.06610v1 fatcat:ll5m44wev5hslcrcmrmp67vqmi

Review, Analyze, and Design a Comprehensive Deep Reinforcement Learning Framework [article]

Ngoc Duy Nguyen, Thanh Thi Nguyen, Hai Nguyen, Saeid Nahavandi
2020 arXiv   pre-print
Reinforcement learning (RL) has emerged as a standard approach for building an intelligent system, which involves multiple self-operated agents to collectively accomplish a designated task.  ...  More importantly, there has been a great attention to RL since the introduction of deep learning that essentially makes RL feasible to operate in high-dimensional environments.  ...  In this paper, however, we consider a deep neural network as the decision model. The previous diagram infers that RL is online learning because the model is updated with incoming data.  ... 
arXiv:2002.11883v1 fatcat:yziq6kwryvh5hiwjm6ju2r5srq

Financial Portfolio Optimization with Online Deep Reinforcement Learning and Restricted Stacked Autoencoder - DeepBreath

Farzan Soleymani, Eric Paquet
2020 Expert Systems with Applications
The framework consists of both offline and online learning strategies: the former is required to train the CNN while the latter handles concept drifts i.e. a change in the data distribution resulting from  ...  These are based on passive concept drift detection and online stochastic batching.  ...  Declaration of Competing Interest All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important  ... 
doi:10.1016/j.eswa.2020.113456 fatcat:pntmi7qawrdcdnvfgqzybqr5gu

Scalable Traffic Signal Controls Using Fog-Cloud Based Multiagent Reinforcement Learning

Paul (Young Joun) Ha, Sikai Chen, Runjia Du, Samuel Labi
2022 Computers  
Fortunately, recent studies have recognized the potential of exploiting advancements in deep and reinforcement learning to address this problem, and some preliminary successes have been achieved in this  ...  It has been shown in past research that it is feasible to optimize the operations of individual TSC systems or a small collection of such systems.  ...  The contents of this paper reflect the views of the authors, who are responsible for the facts and the accuracy of the data presented herein, and do not necessarily reflect the official views or policies  ... 
doi:10.3390/computers11030038 fatcat:d3zck3x5hvh27i4cqdmx77tnhe

On-board Deep Q-Network for UAV-assisted Online Power Transfer and Data Collection

Kai Li, Wei Ni, Eduardo Tovar, Abbas Jamalipour
2019 IEEE Transactions on Vehicular Technology  
A key challenge is online MPT and data collection in the presence of on-board control of a UAV (e.g., patrolling velocity) for preventing battery drainage and data queue overflow of the sensing devices  ...  Therefore, scheduling MPT and data collection online in the presence of on-board control of the UAV (e.g., patrolling velocity) for preventing battery drainage and data queue overflow is critical in UAV-assisted  ...  Acknowledgements This work was partially supported by National Funds through FCT/MCTES (Portuguese Foundation for Science and Technology), within the CISTER Research Unit (CEC/04234); also by the Operational  ... 
doi:10.1109/tvt.2019.2945037 fatcat:t2g73wxqirdh5nde32nwovepge

Batch-Constrained Reinforcement Learning for Dynamic Distribution Network Reconfiguration [article]

Yuanqi Gao, Wei Wang, Jie Shi, Nanpeng Yu
2020 arXiv   pre-print
To address these limitations, this paper develops a data-driven batch-constrained reinforcement learning (RL) algorithm for the dynamic DNR problem.  ...  the historical operational data.  ...  Batch-constrained Reinforcement Learning In the batch RL setup, the agent can only learn from a finite dataset collected by some sampling procedure.  ... 
arXiv:2006.12749v1 fatcat:4f7rb6f2fzeibn3fwxpsf4mjg4

Balancing a CartPole System with Reinforcement Learning – A Tutorial [article]

Swagat Kumar
2020 arXiv   pre-print
In this paper, we provide the details of implementing various reinforcement learning (RL) algorithms for controlling a Cart-Pole system.  ...  In particular, we describe various RL concepts such as Q-learning, Deep Q Networks (DQN), Double DQN, Dueling networks, (prioritized) experience replay and show their effect on the learning performance  ...  The replay memory size of 2000 and batch size of 24 is used for producing the result shown in 5.  ... 
arXiv:2006.04938v2 fatcat:kcgvwd2rlnggreyovr2aumaflu

A Fine-Grain Batching-Based Task Allocation Algorithm for Spatial Crowdsourcing

Yuxin Jiao, Zhikun Lin, Long Yu, Xiaozhu Wu
2022 ISPRS International Journal of Geo-Information  
Experiments on real data and synthetic data show that this method can accomplish the task assignment of spatial crowdsourcing effectively and can adapt to the non-stationary setting as soon as possible  ...  In addition, we also take into account the benefits of requesters, workers, and the platform.  ...  [21] customized a series of methods based on reinforcement learning to overcome the dimension curse and sparse reward problems.  ... 
doi:10.3390/ijgi11030203 fatcat:idipxyjpjnazdiupr6nok7pb3q

Using Deep Reinforcement Learning to Coordinate Multi-Modal Journey Planning with Limited Transportation Capacity

Lara Codeca, Vinny Cahill
2022 SUMO Conference Proceedings  
This paper assesses the viability of Deep Reinforcement Learning (DRL) applied to simulated mobility as a means of learning coordinated plans.  ...  The results show that the learned plans make intuitive use of the available modes and improve average travel time and lateness, supporting the use of DRL in association with a microscopic mobility simulator  ...  Acknowledgments This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 713567.  ... 
doi:10.52825/scp.v2i.89 fatcat:dfuein2tong3vksu7uao5es7vy

Reinforcement learning for control of flexibility providers in a residential microgrid

Brida Verwiyni Mbuwir, Davy Geysen, Fred Spiessens, Geert Deconinck
2019 IET Smart Grid  
This data is expected to assist in power system planning/operation and the transition from passive to active electricity users.  ...  With recent advances in machine learning, this data can be used to learn system dynamics.  ...  Acknowledgment This research is supported by Vlaamse Instelling voor Technologisch Onderzoek (VITO) and partly funded by IWTSBO-140047 SMILE-IT, the Flemish Agency for Innovation through Science and Technology  ... 
doi:10.1049/iet-stg.2019.0196 fatcat:xo2mcyp76ngtjkp7n4q2bs2pyq

A novel machine learning approach for database exploitation detection and privilege control

Chee Keong Wee, Richi Nayak
2019 Journal of Information and Telecommunication  
user activities through the analysis of sequences of user session data.  ...  This paper proposes a novel method to improve the security of a database by using machine learning to learn the user behaviour unique to a database environment and apply that learning to detect anomalous  ...  Richi Nayak is the HDR director of Electrical Engineering and Computer Science faculty in the Queensland University of Technology and the leader of the Applied Data Mining Research group.  ... 
doi:10.1080/24751839.2019.1570454 fatcat:zzzk4xv5dnhnteh6ma7umhepgu

SREC: Proactive Self-Remedy of Energy-Constrained UAV-Based Networks via Deep Reinforcement Learning [article]

Ran Zhang, Miao Wang, Lin X. Cai
2020 arXiv   pre-print
Specifically, a deep reinforcement learning (DRL)-based self remedy approach, named SREC-DRL, is proposed to maximize the accumulated user satisfaction scores for a certain period within which at least  ...  To handle the continuous state and action space in the problem, the state-of-the-art algorithm of the actor-critic DRL, i.e., deep deterministic policy gradient (DDPG), is applied with better convergence  ...  In various applications, UAVs serve as either relays to collect or disseminate data, or additional access points to improve the communication performance.  ... 
arXiv:2009.08528v1 fatcat:nymlq2r35nd47ksv2sswjkxxd4

Scalable multi-agent reinforcement learning for distributed control of residential energy flexibility [article]

Flora Charbonnier, Thomas Morstyn, Malcolm McCulloch
2022 arXiv   pre-print
In the standard independent Q-learning approach, the coordination performance of agents under partial observability drops at scale in stochastic environments.  ...  Here, the novel combination of learning from off-line convex optimisations on historical data and isolating marginal contributions to total rewards in reward signals increases stability and performance  ...  Acknowledgement This work was supported by the Saven European Scholarship and by the UK Research and Innovation and the Engineering and Physical Sciences Research Council (award references EP/S000887/1  ... 
arXiv:2203.03417v1 fatcat:kdjd6iotpvg4fcj3d5gsu4yr6m

Optimizing Intelligent Reflecting Surface-Base Station Association for Mobile Networks [article]

Dongzi Jin, Yong Xiao, Yingyu Li, Guangming Shi, Dusit Niyato
2021 arXiv   pre-print
We focus on the IRS-BS association problem in which multiple BSs compete with each other for controlling the phase shifts of a limited number of IRSs to maximize the long-term downlink data rate for the  ...  We propose MDLBI, a Multi-agent Deep Reinforcement Learning-based BS-IRS association scheme that optimizes the BS-IRS association as well as the phase-shift of each IRS when being associated with different  ...  The authors in [5] adopted a deep reinforcement learning method to optimize the transmit beamforming and reflect beamforming in dynamic environment.  ... 
arXiv:2106.12883v1 fatcat:wdjkfbejsnexnkx3q3kqlpfkqu