We introduce SLM Lab, a software framework for reproducible reinforcement learning (RL) research. ... SLM Lab implements a number of popular RL algorithms, provides synchronous and asynchronous parallel experiment execution, hyperparameter search, and result analysis. ... SOFTWARE FOR REINFORCEMENT LEARNING To date more than twenty reinforcement-learning-themed open source software libraries have been released. ...arXiv:1912.12482v1 fatcat:xwfuzwxsp5agjgziaa2xjr5iyq
Among these is the term "Reinforcement", the applications of which form the basis of this scoping review. ... or reducing something.⁸ Both types of reinforcement strengthen behavior, or increase the probability of a behavior reoccurring. ... A search of the literature on studies from Pakistan showed the use of positive reinforcement during micro feedback sessions.⁴¹ Thus, we can see that reinforcement still holds much standing as a learning ...doi:10.36570/jduhs.2019.3.695 fatcat:mlcpvc6ravbdhhq5pfz4i3vzre
The results at positions 9 and 15 suggest that when the number of semantically related words searched is large enough to activate a semantic code, the recall probability is suddenly boosted and continues ... During stage 1, the dipper operated for 7.5-s periods at intervals of 30 s for a total of 30 reinforcements per session, for two sessions. ...doi:10.1038/264057a0 pmid:12471 fatcat:dvvjx5ub7fbnhelp2kkdmxr2eu
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
In the last ten years, deep reinforcement learning (DRL) has become a promising direction for decision-making, since DRL utilizes the high model capacity of deep learning for complex decision-making tasks ... Information retrieval (IR) systems have become an essential component in modern society to help users find useful information, which consists of a series of processes including query expansion, item recall ... Xiangyu Zhao is partially supported by Start-up Grant (No.9610565) for the New Faculty of the City University of Hong Kong and the CCF-Tencent Open Fund. ...doi:10.1145/3477495.3531703 fatcat:5gmafvsikrb7njqhke4x2kmfou
Each experimental session consisted of a series of 7 VI schedules, providing reinforcement rates that varied between 20 and 1200 h⁻¹. ... Mean escape latencies decreased from 47 sec to 16 sec during the 24 daily sessions. Another group of 10 male rats learned the same task in the light. ...
light), and then the effects of this training were assessed in Pavlovian-to-instrumental transfer (Experiment 1) and retardation-of-learning (Experiment 2) tests. ... In the present experiments, the outcome specificity of learning was explored in an appetitive Pavlovian backward conditioning procedure with rats. ... On each of 2 days, one magazine training session with 1 reinforcer was followed immediately by a second session with the alternative reinforcer. ...doi:10.3758/bf03196000 pmid:14733487 fatcat:ggx5ofoa5vc7nnffimjxlimfci
The system involves three learning problems: the selection of relevant markers regarding the searched category, the reinforcement of these markers and the learning of the relevance function. ... These markers are reinforced to match the distribution of relevant images over the network. We tackle the use of the information gathered during previous search sessions. ... This leads to a threefold learning problem: learning paths during the search session, merging paths learnt during previous search sessions, and learning the similarity function. ...doi:10.1016/j.ins.2010.03.003 fatcat:3mobrjcnvvgcpeggct6jqr6te4
We analyze those solutions from a theoretical point of view and evaluate them empirically on three Atari games from the Arcade Learning Environment. ... applying ML approaches to typical problems of specific domains. ... They also present an in-depth discussion of architecture search spaces and architecture optimization algorithms based on the principles of Reinforcement Learning and evolutionary algorithms. ...doi:10.3390/computers10010011 fatcat:5hbjofe62rc4pebu2wwnv2hlji
We thus tested a version of the previous model where elimination of non-rewarded target is done with a learning rate α fixed to 1, that is, no degree of freedom in the learning rate in contrast with Model ... Overall the data support a role of dACC in integrating reinforcement-based information to regulate decision functions in LPFC. ... Conflict of Interest: None declared. ...doi:10.1093/cercor/bhu114 pmid:24904073 fatcat:qgvykrxlfffqxmxpmngdhg5plq
Days between sessions (M = 3) were held as consistent as possible given the constraints of conducting research on a working ranch and safety-threatening weather conditions. ... Total training time per session and total rest per session were held constant. ... Acknowledgements Portions of the data were presented at the April ...doi:10.1007/s10071-021-01580-7 pmid:34860336 pmcid:PMC9107396 fatcat:rshuntmd75ao7aa66w5rdkcbr4
It presents a new approach that involves unsupervised, reinforcement learning, and cooperation between agents. ... It indicates that combining different learning algorithms is capable of improving user satisfaction indicated by the percentage of precision, recall, the progressive category weight and F1-measure. ... Conflict of Interest The authors have declared no conflict of interest. Compliance with Ethics Requirements This article does not contain any studies with human or animal subjects. ...doi:10.1016/j.jare.2015.06.005 pmid:26966569 pmcid:PMC4767809 fatcat:b2ak7jhchfdldo66mqpovjuwde
Journal of Vision
Reinforcement learning seems to serve as the mechanism to optimize search behavior with respect to the statistics of the task. ... The proportions of saccades meeting the reinforcement criteria increased considerably, and participants matched their search behavior to the relative reinforcement rates of targets. ... The application of such a rigid search-strategy led to an extremely low reward ratio, which impeded the learning of reinforcement contingencies. ...doi:10.1167/16.10.15 pmid:27559719 fatcat:savvu3dy75bapd3abmg7kvd7ty
(U Kentucky, Lexington) Asymmetrical coding of food and no-food events by pigeons: Sample pecking versus food as the basis of the sample code. Learning & Motivation, 1993(May), Vol 24(2), 141-155. ... (U Kentucky, Lexington) Coding of feature and no-feature events by pigeons performing a delayed conditional discrimination. Animal Learning & Behavior, 1993(May), Vol 21(2), 92-100. ...
This extends the observation of intrinsic prediction error-like signals, driven by intrinsic rather than extrinsic reward, to memory-driven visual search. ... of uncertainty. ... Search times of the fMRI session. ...doi:10.2174/1874440001610010126 pmid:27867436 pmcid:PMC5101634 fatcat:3rpksxg2wnb6tbv4w6kzoe56gq
Educational Research and Reviews
follow-up supports to reinforce in-service learning, and in-service training and follow-up supports of sufficient duration and intensity to have discernible teacher and student effects. ... In-service professional development experts' contentions about the key characteristics and core features of effective in-service training were used to code and analyze the research reviews. ... ACKNOWLEDGEMENTS The preparation of the metasynthesis described in this paper was supported, in part, by funding from the U.S. Department of Education, Office of Special Education ...doi:10.5897/err2015.2306 fatcat:ana5ki63i5bxnf3rt5ls573pdu