ACS Central Science Virtual Issue on Machine Learning

Andrew L. Ferguson
2018 ACS Central Science  
learning to predict the products of organic reactions, Pande and co-workers use recurrent neural networks for retrosynthetic reactant prediction, 12 and Zare and co-workers use deep reinforcement learning  ...  This program was provided only with the rules of the ancient board game and learned to play by playing games against itself in a form of reinforcement learning. 2 After just 3 days of training, AlphaGo  ... 
doi:10.1021/acscentsci.8b00528 pmid:30159387 pmcid:PMC6107860 fatcat:qtism3iabbbsvg5osjofishkj4

A State Aggregation Approach for Solving Knapsack Problem with Deep Reinforcement Learning [article]

Reza Refaei Afshar and Yingqian Zhang and Murat Firat and Uzay Kaymak
2020 arXiv   pre-print
This paper proposes a Deep Reinforcement Learning (DRL) approach for solving knapsack problem.  ...  The proposed method consists of a state aggregation step based on tabular reinforcement learning to extract features and construct states.  ...  Our method for solving this variant of KP is based on deep reinforcement learning.  ... 
arXiv:2004.12117v1 fatcat:dh3g4lvscbfwzgvm2tpipd2pum

Deep Reinforcement Learning for Inventory Control: a Roadmap

Robert N. Boute, Joren Gijsbrechts, Willem van Jaarsveld, Nathalie Vanvuchelen
2021 European Journal of Operational Research  
Abstract Deep reinforcement learning (DRL) has shown great potential for sequential decision-making, including early developments in inventory control.  ...  Highlights • Roadmap on how inventory control may benefit from deep reinforcement learning. • We describe the key design choices of learning algorithms for inventory applications. • We shed light on future  ...  Introduction Reinforcement learning (RL) is an area of machine learning that focuses on sequential decision-making.  ... 
doi:10.1016/j.ejor.2021.07.016 fatcat:pctc2mgiorhhvls4rgsikc5l4u

Deep Reinforcement Learning for Inventory Control: A Roadmap

Robert N. Boute, Joren Gijsbrechts, Willem van Jaarsveld, Nathalie Vanvuchelen
2021 Social Science Research Network  
Deep reinforcement learning (DRL) has shown great potential for sequential decision-making, including early developments in inventory control.  ...  Yet, the abundance of choices that come with designing a DRL algorithm, combined with the intense computational effort to tune and evaluate each choice, may hamper their application in practice.  ...  Introduction Reinforcement learning (RL) is an area of machine learning that focuses on sequential decision-making.  ... 
doi:10.2139/ssrn.3861821 fatcat:gav75rsk6nc2rmb7legigyorru

User Response Prediction in Online Advertising [article]

Zhabiz Gharibshah, Xingquan Zhu
2021 arXiv   pre-print
However, existing literature mainly focuses on algorithmic-driven designs to solve specific challenges, and no comprehensive review exists to answer many important questions.  ...  Recent years have witnessed a significant increase in the number of studies using computational approaches, including machine learning methods, for user response prediction.  ...  immediate advertising revenue and long-run user experience Joint optimization using two-level reinforcement learning Recomm.  ... 
arXiv:2101.02342v2 fatcat:clgefamcd5fmbeg5ephizy3zqu

A Tutorial on Thompson Sampling

Daniel J. Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen
2018 Foundations and Trends® in Machine Learning  
with neural networks, and reinforcement learning in Markov decision processes.  ...  Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and  ...  Indeed, by sampling action 0 in the first period, the decision maker immediately learns the value of θ, and can exploit that knowledge to play the optimal action in all subsequent periods.  ... 
doi:10.1561/2200000070 fatcat:glgeh2ghmfaxzmzbxsm3zedipu

Active Learning: Approaches and Issues

T.R. Chaudhur, L.G.C. Hamey
1997 Journal of Intelligent Systems  
There are three major recognised approaches to the implementation of active learning: goal-driven learning, reinforcement learning and querying.  ...  This paper surveys published work in active learning research with the purpose of providing a unified understanding of the area.  ...  Learning regular sets from queries and examples,  ... 
doi:10.1515/jisys.1997.7.3-4.205 fatcat:stjvzotm7renhnty3ktfhzvvgm

Putting hands to rest: efficient deep CNN-RNN architecture for chemical named entity recognition with no hand-crafted rules

Ilia Korvigo, Maxim Holmatov, Anatolii Zaikovskii, Mikhail Skoblov
2018 Journal of Cheminformatics  
algorithms, such as deep neural networks, can automatically design the rules with little to none human intervention.  ...  Here we explored this approach by experimenting with various deep learning architectures for targeted tokenisation and named entity recognition.  ...  Design choices Deep-learning models.  ... 
doi:10.1186/s13321-018-0280-0 pmid:29796778 pmcid:PMC5966369 fatcat:47rjjknesragtjuwapu3jqiz6m

Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising [article]

Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu (+3 others)
2020 arXiv   pre-print
To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.  ...  In this paper, we formulate the sequential advertising strategy optimization as a dynamic knapsack problem.  ...  The MDP can be solved using existing Deep Reinforcement Learning (DRL) algorithms such as DQN (Mnih et al., 2013), DDPG (Lillicrap et al., 2015) and PPO (Schulman et al., 2017).  ... 
arXiv:2006.16312v1 fatcat:j2lh2tta3je6lc3z5jva2f5fee

Emergent Graphical Conventions in a Visual Communication Game [article]

Shuwen Qiu, Sirui Xie, Lifeng Fan, Tao Gao, Song-Chun Zhu, Yixin Zhu
2021 arXiv   pre-print
We devise a novel reinforcement learning method such that agents are evolved jointly towards successful communication and abstract graphical conventions.  ...  To inspect the emerged conventions, we carefully define three key properties -- iconicity, symbolicity, and semanticity -- and design evaluation methods accordingly.  ...  Learning to communicate with deep multi-agent reinforcement learning.  ... 
arXiv:2111.14210v2 fatcat:e75iuvjbazbypcmasfr7ah4p6a

Putting hands to rest: efficient deep CNN-RNN architecture for chemical named entity recognition with no hand-crafted rules [article]

Ilia Korvigo, Maxim Holmatov, Anatolii Zaikovskii, Mikhail Skoblov
2018 bioRxiv   pre-print
algorithms, such as deep neural networks, can automatically design the rules with little to none human intervention.  ...  Here we explored this approach by experimenting with various deep learning architectures for targeted tokenisation and named entity recognition.  ...  Design choices Deep-learning models.  ... 
doi:10.1101/321224 fatcat:6w4immj2ynhjxe4l3r6lpazoi4

Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference [article]

Jianghao Shen, Yonggan Fu, Yue Wang, Pengfei Xu, Zhangyang Wang, Yingyan Lin
2020 arXiv   pre-print
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice.  ...  Existing works exploited this observation by learning to skip convolutional layers in an input-dependent manner.  ...  The work of SkipNet uses a hybrid learning algorithm that sequentially performs supervised pretraining and reinforcement fine-tuning, achieving better resource saving and accuracy tradeoff than existing  ... 
arXiv:2001.00705v1 fatcat:72xef5fl4bhwxcwdkprrrsqz4i

Interactive evolutionary computation: fusion of the capabilities of EC optimization and human evaluation

H. Takagi
2001 Proceedings of the IEEE  
In this survey, the IEC application fields include graphic arts and animation, 3-D CG lighting, music, editorial design, industrial design, facial image generation, speech processing and synthesis, hearing  ...  The IEC is an EC that optimizes systems based on subjective human evaluation. The definition and features of the IEC are first described and then followed by an overview of the IEC research.  ...  Acknowledgements This survey was completed with the help of several people who provided information on IEC papers, sent the papers, commented on this article, helped build a database, and prepared this  ... 
doi:10.1109/5.949485 fatcat:jwqoptm2zjeovolfhkrqsoqpfi

Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference

Jianghao Shen, Yue Wang, Pengfei Xu, Yonggan Fu, Zhangyang Wang, Yingyan Lin
2020 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice.  ...  Existing works exploited this observation by learning to skip convolutional layers in an input-dependent manner.  ...  The work of SkipNet uses a hybrid learning algorithm that sequentially performs supervised pretraining and reinforcement fine-tuning, achieving better resource saving and accuracy tradeoff than existing  ... 
doi:10.1609/aaai.v34i04.6025 fatcat:mp66ycyvdzap7bnrdd2vegprpm

Relational Neurogenesis for Lifelong Learning Agents

Tej Pandit, Dhireesha Kudithipudi
2020 Proceedings of the Neuro-inspired Computational Elements Workshop  
The ability to learn through continuous reinforcement and interaction with an environment negates the requirement of painstakingly curated datasets and hand crafted features.  ...  However, the ability to learn multiple tasks in a sequential manner, referred to as lifelong or continual learning, remains unresolved.  ...  These networks were most uncreatively named Deep Q-Learning Networks (DQN), and the technique was called Deep Reinforcement the experiments carried out by DQN.  ... 
doi:10.1145/3381755.3381766 dblp:conf/nice/PanditK20 fatcat:flknsyjdprbdxg76upjzpdacju