Parameter-free online learning via model selection [article]

Dylan J. Foster, Satyen Kale, Mehryar Mohri, Karthik Sridharan
2018 arXiv   pre-print
We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning.  ...  Finally, we generalize these results by providing oracle inequalities for arbitrary non-linear classes in the online supervised learning model.  ...  Application 1: Parameter-free online learning in uniformly convex Banach spaces.  ... 
arXiv:1801.00101v2 fatcat:3oom3h6aa5d4zjiswqihlptzfy

Online Information-Aware Motion Planning with Inertial Parameter Learning for Robotic Free-Flyers [article]

Monica Ekal, Keenan Albee, Brian Coltin, Rodrigo Ventura, Richard Linares, David W. Miller
2021 arXiv   pre-print
The method consists of a two-tiered (global and local) planner, a low-level model predictive controller, and an online parameter estimator that produces estimates of the robot's inertial properties for  ...  Recognizing this, this work proposes RATTLE, an online information-aware motion planning algorithm that explicitly weights parametric model-learning coupled with real-time replanning capability that can  ...  The main contributions of this paper are: 1) RATTLE, a novel motion planning method for creating selectively information-aware plans with online parameter estimation to reduce parametric uncertainty; 2  ... 
arXiv:2112.05878v1 fatcat:or4vmvnai5a5tm6ispyt6riuty

Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data [article]

Decebal Constantin Mocanu and Maria Torres Vega and Eric Eaton and Peter Stone and Antonio Liotta
2016 arXiv   pre-print
Conceived in the early 1990s, Experience Replay (ER) has been shown to be a successful mechanism to allow online learning algorithms to reuse past experiences.  ...  Traditionally, ER can be applied to all machine learning paradigms (i.e., unsupervised, supervised, and reinforcement learning).  ...  The free parameters can then be updated afterwards via: $\Delta\Theta = \frac{\partial F(v^{0})}{\partial\theta} - \frac{\partial F(v^{n_{CD}})}{\partial\theta}$ (4), yielding the following update rules for the free parameters of binary RBMs: $\Delta W_{ji} \propto v_i^{0} h_j^{0} - v^{n_{CD}}$  ... 
arXiv:1610.05555v1 fatcat:iyuhxfjaijfj5jzmno5qtx4njq

Model-based choices involve prospective neural activity

Bradley B Doll, Katherine D Duncan, Dylan A Simon, Daphna Shohamy, Nathaniel D Daw
2015 Nature Neuroscience  
Decisions may arise via 'model-free' repetition of previously reinforced actions or by 'model-based' evaluation, which is widely thought to follow from prospective anticipation of action consequences using  ...  a learned map or model.  ...  Reprints and permissions information is available online at reprints/index.html.  ... 
doi:10.1038/nn.3981 pmid:25799041 pmcid:PMC4414826 fatcat:n7nsqqzgrreudnlib2swkwygw4

XMRF: An R package to Fit Markov Networks to High-Throughput Genetics Data [article]

Ying-Wooi Wan, Genevera I. Allen, Yulia Baker, Eunho Yang, Pradeep Ravikumar, Zhandong Liu
2015 bioRxiv   pre-print
data (counts via Poisson graphical models), mutation and copy number variation data (categorical via Ising models), and methylation data (continuous via Gaussian graphical models).  ...  Encoding the models and estimation techniques of the recently proposed exponential family Markov Random Fields (Yang et al., 2012), our software can be used to learn genetic networks from RNA-sequencing  ...  ,method="LPGM") to learn the same simulated scale-free network of 30 nodes from 200 observations along a path of 20 regularization parameters. > library(XMRF) > n = 200 > p = 30 # Simulate a scale-free  ... 
doi:10.1101/032219 fatcat:fj4zo6yiffebpmuahqdyei53ia

Cognitive Learning-Aided Multi-Antenna Communications [article]

Ahmet M. Elbir, Kumar Vijay Mishra
2021 arXiv   pre-print
Deep learning (DL) is critical in enabling essential features of cognitive systems because of its fast prediction performance, adaptive behavior, and model-free structure.  ...  There are research opportunities to address significant design challenges arising from insufficient data coverage, learning model complexity, and data transmission overheads.  ...  As the learning performance goes below a threshold, the OL provides only infrequent updates to the model parameters. 3) Reinforcement Learning: The OL requires online datasets to update the learning model  ... 
arXiv:2010.03131v2 fatcat:hc7iqwajwnbp7bel57yauvto6y

Bayes-Adaptive Deep Model-Based Policy Optimisation [article]

Tai Hoang, Ngo Anh Vien
2021 arXiv   pre-print
We introduce a Bayesian (deep) model-based reinforcement learning method (RoMBRL) that can capture model uncertainty to achieve sample-efficient policy optimisation.  ...  RoMBRL maintains model uncertainty via belief distributions through a deep Bayesian neural network whose samples are generated via stochastic gradient Hamiltonian Monte Carlo.  ...  Introduction The family of reinforcement learning (RL) algorithms consists of two major approaches: model-free and model-based [1] .  ... 
arXiv:2010.15948v3 fatcat:nf2m52ph5bashmiydmrmdxtmum

Machine Learning Techniques Applied to Human-Robot Collaboration

Jeyhoon Maskania, Loris Roveda
2019 Zenodo  
Such learned model is then used by a model predictive controller (MPC) with cross-entropy method (CEM) to online optimize the impedance control parameters (i.e., stiffness and damping parameters).  ...  An ensemble of neural networks (ANNs) is used to learn a human-robot interaction dynamics model, catching dynamics uncertainties.  ...  by a model predictive controller MPC to online optimize the impedance control parameters, i.e., stiffness and damping parameters).  ... 
doi:10.5281/zenodo.4793724 fatcat:4i4zqc5rrfbsrkrdd72o6dykjq

Gradient-based Editing of Memory Examples for Online Task-free Continual Learning [article]

Xisen Jin, Arka Sadhu, Junyi Du, Xiang Ren
2021 arXiv   pre-print
We explore task-free continual learning (CL), in which a model is trained to avoid catastrophic forgetting in the absence of explicit task boundaries or identities.  ...  Among many efforts on task-free CL, a notable family of approaches are memory-based that store and replay a subset of training examples.  ...  Online Task-free Continual Learning [3] is a specific formulation of the continual learning where the task boundaries and identities are not available to the model.  ... 
arXiv:2006.15294v3 fatcat:5gcf7iommjhb7knws2od3wcze4

Home Is Where the Up-Votes Are: Behavior Changes in Response to Feedback in Social Media [article]

Sanmay Das, Allen Lavoie
2014 arXiv   pre-print
The model allows us to make predictions, particularly in the context of social media, about which community a user will select, and to quantify how future selections change based on the feedback a user  ...  We introduce a quantitative model of behavior changes in response to such feedback, drawing on inverse reinforcement learning and studies of human game playing.  ...  Selection is governed by Hierarchical Dirichlet Process parameters α0 (scalar concentration parameter) and β (global community popularity vector), and by learning parameters φ (recency) and (exploration  ... 
arXiv:1406.7738v1 fatcat:jmvzdeegzfhqbhlw33m4eqaeca

A calibration-free method for biosensing in cell manufacturing [article]

Jialei Chen, Zhaonan Liu, Kan Wang, Chen Jiang, Chuck Zhang, Ben Wang
2020 arXiv   pre-print
Specifically, we model this variability via a patient-specific calibration parameter, and use readings from multiple biosensors to construct a patient-invariance statistic, thereby alleviating the effect  ...  A carefully formulated optimization problem and an algorithmic framework are presented to find the best patient-invariance statistic and the model parameters.  ...  In the online monitoring stage, viable cell concentrations can be recovered via the invariance statistic, free from the patient-specific calibration parameter.  ... 
arXiv:2007.14391v1 fatcat:oozrskbfang6vonjbk5pqazhqe

Dream to Control: Learning Behaviors by Latent Imagination [article]

Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
2020 arXiv   pre-print
Learned world models summarize an agent's experience to facilitate learning complex behaviors.  ...  We efficiently learn behaviors by propagating analytic gradients of learned state values back through trajectories imagined in the compact state space of a learned world model.  ...  PlaNet learns the same world model as Dreamer and selects actions via online planning without an action model and drastically improves over D4PG and A3C in data efficiency.  ... 
arXiv:1912.01603v3 fatcat:o3765b6jtfbfnpqqfmf6v72p6i

Re-using prior tactile experience by robotic hands to discriminate in-hand objects via texture properties

Mohsen Kaboli, Rich Walker, Gordon Cheng
2016 2016 IEEE International Conference on Robotics and Automation (ICRA)  
This paper proposes an online tactile transfer learning strategy for discriminating objects through the surface texture properties via a robotic hand and an artificial robotic skin.  ...  The proposed method has the ability to autonomously select and exploit the previously learned multiple texture models while discriminating new textures with a very few available training samples or even  ...  (Prior Texture Model Selection) 3.1-Initializing the online learning algorithm with the constructed prior models 3.2-Constructing the new texture models while receiving new textures/objects 4-Updating  ... 
doi:10.1109/icra.2016.7487372 dblp:conf/icra/KaboliWC16 fatcat:3n5p64celjdxrn466ebek6tiu4

Style transfer matrix learning for writer adaptation

Xu-Yao Zhang, Cheng-Lin Liu
2011 CVPR 2011  
matrix are learned via online discriminative learning.  ...  Experiments on a large-scale Chinese online handwriting database demonstrate that STM learning can reduce recognition errors significantly, and the unsupervised adaptation model performs even better than  ...  We suggest to set β via equation (11) and select β from [0, 3].  ... 
doi:10.1109/cvpr.2011.5995661 dblp:conf/cvpr/ZhangL11 fatcat:iszsvrvan5feba7teb33k3g65e

Deep Learning-based Power Control for Cell-Free Massive MIMO Networks [article]

Nuwanthika Rajapaksha, K. B. Shashika Manosha, Nandana Rajatheva, Matti Latva-aho
2021 arXiv   pre-print
An online learning stage is also introduced, which results in near-optimal performance with 4-6 times faster processing.  ...  A deep learning (DL)-based power control algorithm that solves the max-min user fairness problem in a cell-free massive multiple-input multiple-output (MIMO) system is proposed.  ...  parameters in the model.  ... 
arXiv:2102.10366v1 fatcat:jvk544lygfauvb662haqx23g7q