Punishment insensitivity in humans is due to failures in instrumental contingency learning

Philip Jean-Richard-dit-Bressel, Jessica C Lee, Shi Xian Liew, Gabrielle Weidemann, Peter F Lovibond, Gavan P McNally
2021 eLife  
Sensitive and insensitive individuals equally liked reward and showed similar rates of reward-seeking.  ...  Punishment maximises the probability of our individual survival by reducing behaviours that cause us harm, and also sustains trust and fairness in groups essential for social cohesion.  ...  The key advantages of this task were that multiple, competing explanations of punishment sensitivity could be assessed concurrently and more directly than previous studies, and could be mapped using a  ... 
doi:10.7554/elife.69594 pmid:34085930 fatcat:kex5xroswjep3h55qh2nr6f6pe

Inferring agent objectives at different scales of a complex adaptive system [article]

Dieter Hendricks, Adam Cobb, Richard Everett, Jonathan Downing and Stephen J. Roberts
2017 arXiv   pre-print
Given scale-specific temporal state trajectories and action sequences estimated from aggregate market behaviour, we use Inverse Reinforcement Learning to compute the effective reward function for the aggregate  ...  agent class at each scale, allowing us to assess the relative attractiveness of feature vectors across different scales.  ...  This could lead to a hierarchical reinforcement learning framework with multi-scale learning in financial markets to exploit hierarchy of causality at different scales.  ... 
arXiv:1712.01137v1 fatcat:cygotzx7abextjbxbxhyewb7zq

Gaussian process-based algorithmic trading strategy identification

Steve Y. Yang, Qifeng Qiao, Peter A. Beling, William T. Scherer, Andrei A. Kirilenko
2015 Quantitative finance (Print)  
We infer the reward (or objective) function for this process from observations of trading actions using a process from machine learning known as inverse reinforcement learning (IRL).  ...  The reward functions learned through IRL then constitute a feature space that can be the basis for supervised learning (for classification or recognition of traders) or unsupervised learning (for categorization  ...  (a) Hierarchical clustering in the LNIRL reward space. (b) Hierarchical clustering in the GPIRL reward space.  ... 
doi:10.1080/14697688.2015.1011684 fatcat:b7sdwvumojal5hhoqn3j6pc4m4

From lists of behaviour change techniques (BCTs) to structured hierarchies: Comparison of two methods of developing a hierarchy of BCTs

James Cane, Michelle Richardson, Marie Johnston, Ruhina Ladha, Susan Michie
2014 British Journal of Health Psychology  
The 'bottom-up' structure was examined for higher-order groupings using a dendrogram derived from hierarchical cluster analysis.  ...  Behaviour change technique (BCT) Taxonomy v1 is a hierarchically grouped, consensus-based taxonomy of 93 BCTs for reporting intervention content.  ...  To increase the usability and speed of recall of BCTs, BCT Taxonomy v1 was organized hierarchically using an open-sort task and hierarchical cluster analysis (HCA).  ... 
doi:10.1111/bjhp.12102 pmid:24815766 fatcat:guo36wp5grea7iss74dxeg2xue

The Monash Autism-ADHD genetics and neurodevelopment (MAGNET) project design and methodologies: a dimensional approach to understanding neurobiological and genetic aetiology

Rachael Knott, Beth P. Johnson, Jeggan Tiego, Olivia Mellahn, Amy Finlay, Kathryn Kallady, Maria Kouspos, Vishnu Priya Mohanakumar Sindhu, Ziarih Hawi, Aurina Arnatkeviciute, Tracey Chau, Dalia Maron (+15 others)
2021 Molecular Autism  
of symptoms and behaviours; investigate the degree of familiality for different dimensional ASD-ADHD phenotypes and clusters; and map the neurocognitive, brain imaging, and genetic correlates of these  ...  Using a comprehensive phenotyping protocol capturing dimensional traits central to ASD and ADHD, the MAGNET project aims to identify data-driven clusters across ADHD-ASD spectra using deep phenotyping  ...  Clustering: The MAGNET Project will use both supervised and unsupervised methods for discovery of ASD-ADHD clusters using measures of symptoms and behaviours.  ... 
doi:10.1186/s13229-021-00457-3 pmid:34353377 pmcid:PMC8340366 fatcat:23td4r2x2ngwvok5v2nm7fhivq

Human-level performance in first-person multiplayer games with population-based deep reinforcement learning [article]

Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio Garcia Castaneda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green (+6 others)
2018 arXiv   pre-print
Each agent in the population learns its own internal reward signal to complement the sparse delayed reward from winning, and selects actions using a novel temporally hierarchical representation that enables  ...  However, the real-world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open  ...  The resulting components were treated as behavioural clusters, letting us characterise a two second clip of CTF gameplay as one belonging to one of 32 behavioural clusters.  ... 
arXiv:1807.01281v1 fatcat:cgl4wrrjvjemjolgv2gj2urqkq

Hierarchical principles of embodied reinforcement learning: A review [article]

Manfred Eppe, Christian Gumbsch, Matthias Kerzel, Phuong D.H. Nguyen, Martin V. Butz, Stefan Wermter
2020 arXiv   pre-print
Among the most promising computational approaches to provide comparable learning-based problem-solving abilities for artificial agents and robots is hierarchical reinforcement learning.  ...  We expect our results to guide the development of more sophisticated cognitively inspired hierarchical methods, so that future artificial agents achieve a problem-solving performance on the level of intelligent  ...  Another prominent intrinsic reward model that is commonly used in non-hierarchical reinforcement learning is based on surprise and curiosity [90, 91, 107, 109].  ... 
arXiv:2012.10147v1 fatcat:dfkdehyz2rggtimmlcmtvycpxe

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies [article]

Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell
2022 arXiv   pre-print
We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model.  ...  We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model.  ...  We analyse the learned skills to show that they effectively cluster data into distinct, interpretable behaviours.  ... 
arXiv:2112.05062v2 fatcat:mudcv6wdo5a33lri3wua6qpdvq

Bayesian Nonparametric Multi-Optima Policy Search in Reinforcement Learning

Danilo Bruno, Sylvain Calinon, Darwin Caldwell
This problem is addressed in this paper within the framework of Reinforcement Learning, as the automatic determination of multiple optimal parameterized policies.  ...  In this case, the knowledge of multiple solutions can avoid relearning the task.  ...  Another recent approach (Daniel, Neumann, and Peters 2012) uses hierarchical policy learning to treat the different policies as multiple possible options for the given task.  ... 
doi:10.1609/aaai.v27i1.8542 fatcat:2est74ds3fd4fc4dxvgpmlsgyu

ToyArchitecture: Unsupervised learning of interpretable models of the environment

Jaroslav Vítků, Petr Dluhoš, Joseph Davidson, Matěj Nikl, Simon Andersson, Přemysl Paška, Jan Šinkora, Petr Hlubuček, Martin Stránský, Martin Hyben, Martin Poliak, Jan Feyereisl (+2 others)
2020 PLoS ONE  
This architecture incorporates the unsupervised learning of a model of the environment, learning the influence of one's own actions, model-based reinforcement learning, hierarchical planning, and symbolic  ...  The learned model is stored in the form of hierarchical representations which are increasingly more abstract, but can retain details when needed.  ...  At the beginning of the simulation, only several cluster centers are used, the Temporal Pooler learns transitions between currently used clusters.  ... 
doi:10.1371/journal.pone.0230432 pmid:32421693 fatcat:kl4kbezycrayfbgqp3pgt5xcau

The hippocampal formation as a hierarchical generative model supporting generative replay and continual learning [article]

Ivilin Stoianov, Domenico Maisto, Giovanni Pezzulo
2020 bioRxiv   pre-print
Our experiments show that the hierarchical model using generative replay is able to learn and retain efficiently multiple spatial navigation trajectories, organizing them into separate spatial maps.  ...  of multiple sequential experiences.  ...  The GEFORCE Titan GPU card used for this research was donated by the NVIDIA Corp.  ... 
doi:10.1101/2020.01.16.908889 fatcat:bqkoe26xyzel5ihrhmwis57sne

Detecting and Responding to Concept Drift in Business Processes

Lingkai Yang, Sally McClean, Mark Donnelly, Kevin Burke, Kashaf Khan
2022 Algorithms  
Concept drift, which refers to changes in the underlying process structure or customer behaviour over time, is inevitable in business processes, causing challenges in ensuring that the learned model is  ...  a proper representation of the new data.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/a15050174 fatcat:3n4gnl33dfeatok6r6ofvzg42a

Exploring Clustering-Based Reinforcement Learning for Personalized Book Recommendation in Digital Library

Xinhua Wang, Yuchen Wang, Lei Guo, Liancheng Xu, Baozhong Gao, Fangai Liu, Wei Li
2021 Information  
Furthermore, to overcome the sparsity issue of students' borrowing behaviours, a clustering-based reinforcement learning algorithm is further developed.  ...  Moreover, due to the lack of direct supervision information, we treat noise filtering in sequences as a decision-making process and innovatively introduce a reinforcement learning method as our recommendation  ...  The hierarchical reinforcement learning model can effectively deal with sparse rewards, so it can be effectively used in the recommendation system.  ... 
doi:10.3390/info12050198 doaj:4ac92bb218cf469687cb00f7ad904f24 fatcat:hinnymrc2vc3jo47ipamujw6c4

Hierarchical Text Generation and Planning for Strategic Dialogue [article]

Denis Yarats, Mike Lewis
2018 arXiv   pre-print
Experiments show that our approach increases the end-task reward achieved by the model, improves the effectiveness of long-term planning using rollouts, and allows self-play reinforcement learning to improve  ...  We then use these latent sentence representations for hierarchical language generation, planning and reinforcement learning.  ...  Interestingly, we found this behaviour only happened with the models using rollouts.  ... 
arXiv:1712.05846v2 fatcat:dulmb7uurzbu5h5sq5boklg2fy

ToyArchitecture: Unsupervised Learning of Interpretable Models of the World [article]

Jaroslav Vítků, Petr Dluhoš, Joseph Davidson, Matěj Nikl, Simon Andersson, Přemysl Paška, Jan Šinkora, Petr Hlubuček, Martin Stránský, Martin Hyben, Martin Poliak, Jan Feyereisl (+1 others)
2019 arXiv   pre-print
In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world  ...  This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.  ...  At the beginning of the simulation, only several cluster centers are used, the Temporal Pooler learns transitions between currently used clusters.  ... 
arXiv:1903.08772v2 fatcat:wnknrw73pfhnpi6zy35pecriom