ACS Central Science Virtual Issue on Machine Learning
2018
ACS Central Science
learning to predict the products of organic reactions, Pande and co-workers use recurrent neural networks for retrosynthetic reactant prediction, 12 and Zare and co-workers use deep reinforcement learning ...
This program was provided only with the rules of the ancient board game and learned to play by playing games against itself in a form of reinforcement learning. 2 After just 3 days of training, AlphaGo ...
doi:10.1021/acscentsci.8b00528
pmid:30159387
pmcid:PMC6107860
fatcat:qtism3iabbbsvg5osjofishkj4
A State Aggregation Approach for Solving Knapsack Problem with Deep Reinforcement Learning
[article]
2020
arXiv
pre-print
This paper proposes a Deep Reinforcement Learning (DRL) approach for solving knapsack problem. ...
The proposed method consists of a state aggregation step based on tabular reinforcement learning to extract features and construct states. ...
Our method for solving this variant of KP is based on deep reinforcement learning. ...
arXiv:2004.12117v1
fatcat:dh3g4lvscbfwzgvm2tpipd2pum
Deep Reinforcement Learning for Inventory Control: a Roadmap
2021
European Journal of Operational Research
Abstract Deep reinforcement learning (DRL) has shown great potential for sequential decision-making, including early developments in inventory control. ...
Highlights • Roadmap on how inventory control may benefit from deep reinforcement learning. • We describe the key design choices of learning algorithms for inventory applications. • We shed light on future ...
Introduction Reinforcement learning (RL) is an area of machine learning that focuses on sequential decision-making. ...
doi:10.1016/j.ejor.2021.07.016
fatcat:pctc2mgiorhhvls4rgsikc5l4u
Deep Reinforcement Learning for Inventory Control: A Roadmap
2021
Social Science Research Network
Deep reinforcement learning (DRL) has shown great potential for sequential decision-making, including early developments in inventory control. ...
Yet, the abundance of choices that come with designing a DRL algorithm, combined with the intense computational effort to tune and evaluate each choice, may hamper their application in practice. ...
Introduction Reinforcement learning (RL) is an area of machine learning that focuses on sequential decision-making. ...
doi:10.2139/ssrn.3861821
fatcat:gav75rsk6nc2rmb7legigyorru
User Response Prediction in Online Advertising
[article]
2021
arXiv
pre-print
However, existing literature mainly focuses on algorithmic-driven designs to solve specific challenges, and no comprehensive review exists to answer many important questions. ...
Recent years have witnessed a significant increase in the number of studies using computational approaches, including machine learning methods, for user response prediction. ...
immediate advertising revenue and long-run user experience Joint optimization using two-level reinforcement learning Recomm. ...
arXiv:2101.02342v2
fatcat:clgefamcd5fmbeg5ephizy3zqu
A Tutorial on Thompson Sampling
2018
Foundations and Trends® in Machine Learning
with neural networks, and reinforcement learning in Markov decision processes. ...
Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and ...
Indeed, by sampling action 0 in the first period, the decision maker immediately learns the value of θ, and can exploit that knowledge to play the optimal action in all subsequent periods. ...
doi:10.1561/2200000070
fatcat:glgeh2ghmfaxzmzbxsm3zedipu
Active Learning: Approaches and Issues
1997
Journal of Intelligent Systems
There are three major recognised approaches to the implementation of active learning: goal-driven learning, reinforcement learning and querying. ...
This paper surveys published work in active learning research with the purpose of providing a unified understanding of the area. ...
Learning regular sets from queries and examples, ...
doi:10.1515/jisys.1997.7.3-4.205
fatcat:stjvzotm7renhnty3ktfhzvvgm
Putting hands to rest: efficient deep CNN-RNN architecture for chemical named entity recognition with no hand-crafted rules
2018
Journal of Cheminformatics
algorithms, such as deep neural networks, can automatically design the rules with little to none human intervention. ...
Here we explored this approach by experimenting with various deep learning architectures for targeted tokenisation and named entity recognition. ...
Design choices Deep-learning models. ...
doi:10.1186/s13321-018-0280-0
pmid:29796778
pmcid:PMC5966369
fatcat:47rjjknesragtjuwapu3jqiz6m
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
[article]
2020
arXiv
pre-print
To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. ...
In this paper, we formulate the sequential advertising strategy optimization as a dynamic knapsack problem. ...
The MDP can be solved using existing Deep Reinforcement Learning (DRL) algorithms such as DQN (Mnih et al., 2013), DDPG (Lillicrap et al., 2015) and PPO (Schulman et al., 2017). ...
arXiv:2006.16312v1
fatcat:j2lh2tta3je6lc3z5jva2f5fee
Emergent Graphical Conventions in a Visual Communication Game
[article]
2021
arXiv
pre-print
We devise a novel reinforcement learning method such that agents are evolved jointly towards successful communication and abstract graphical conventions. ...
To inspect the emerged conventions, we carefully define three key properties -- iconicity, symbolicity, and semanticity -- and design evaluation methods accordingly. ...
Learning to communicate with deep multi-agent reinforcement learning. ...
... national Conference on Learning Representations (ICLR), ...
arXiv:2111.14210v2
fatcat:e75iuvjbazbypcmasfr7ah4p6a
Putting hands to rest: efficient deep CNN-RNN architecture for chemical named entity recognition with no hand-crafted rules
[article]
2018
bioRxiv
pre-print
algorithms, such as deep neural networks, can automatically design the rules with little to none human intervention. ...
Here we explored this approach by experimenting with various deep learning architectures for targeted tokenisation and named entity recognition. ...
Design choices Deep-learning models. ...
doi:10.1101/321224
fatcat:6w4immj2ynhjxe4l3r6lpazoi4
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference
[article]
2020
arXiv
pre-print
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. ...
Existing works exploited this observation by learning to skip convolutional layers in an input-dependent manner. ...
The work of SkipNet uses a hybrid learning algorithm that sequentially performs supervised pretraining and reinforcement fine-tuning, achieving better resource saving and accuracy tradeoff than existing ...
arXiv:2001.00705v1
fatcat:72xef5fl4bhwxcwdkprrrsqz4i
Interactive evolutionary computation: fusion of the capabilities of EC optimization and human evaluation
2001
Proceedings of the IEEE
In this survey, the IEC application fields include graphic arts and animation, 3-D CG lighting, music, editorial design, industrial design, facial image generation, speech processing and synthesis, hearing ...
The IEC is an EC that optimizes systems based on subjective human evaluation. The definition and features of the IEC are first described and then followed by an overview of the IEC research. ...
Acknowledgements This survey was completed with the help of several people who provided information on IEC papers, sent the papers, commented on this article, helped build a database, and prepared this ...
doi:10.1109/5.949485
fatcat:jwqoptm2zjeovolfhkrqsoqpfi
Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference
2020
PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. ...
Existing works exploited this observation by learning to skip convolutional layers in an input-dependent manner. ...
The work of SkipNet uses a hybrid learning algorithm that sequentially performs supervised pretraining and reinforcement fine-tuning, achieving better resource saving and accuracy tradeoff than existing ...
doi:10.1609/aaai.v34i04.6025
fatcat:mp66ycyvdzap7bnrdd2vegprpm
Relational Neurogenesis for Lifelong Learning Agents
2020
Proceedings of the Neuro-inspired Computational Elements Workshop
The ability to learn through continuous reinforcement and interaction with an environment negates the requirement of painstakingly curated datasets and hand crafted features. ...
However, the ability to learn multiple tasks in a sequential manner, referred to as lifelong or continual learning, remains unresolved. ...
These networks were most uncreatively named Deep Q-Learning Networks (DQN), and the technique was called Deep Reinforcement ...
... the experiments carried out by DQN. ...
doi:10.1145/3381755.3381766
dblp:conf/nice/PanditK20
fatcat:flknsyjdprbdxg76upjzpdacju