
ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning [article]

Sean Chen, Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine
2022 arXiv   pre-print
We propose a hierarchical solution that learns efficiently from sparse user feedback: we use offline pre-training to acquire a latent embedding space of useful, high-level robot behaviors, which, in turn  ...  Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's  ...  ACKNOWLEDGEMENTS Thanks to members of the InterACT and RAIL labs at UC Berkeley for feedback on this project.  ... 
arXiv:2202.02465v1 fatcat:h6v75dr2rrbjlfiftjuqhiblju

Human-in-the-Loop Methods for Data-Driven and Reinforcement Learning Systems [article]

Vinicius G. Goecks
2020 arXiv   pre-print
This research investigates how to integrate these human interaction modalities into the reinforcement learning loop, increasing sample efficiency and enabling real-time reinforcement learning in robotics  ...  This can be attributed to the fact that current state-of-the-art, end-to-end reinforcement learning approaches still require thousands or millions of data samples to converge to a satisfactory policy and  ...  There have been many examples in the field of human-robot interaction where human interaction is used to train autonomous systems in the context of end-to-end reinforcement learning.  ... 
arXiv:2008.13221v1 fatcat:aofoenmwcvckvagbttrkskevty

Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects [article]

Xinlei Pan, Animesh Garg, Animashree Anandkumar, Yuke Zhu
2020 arXiv   pre-print
We develop a novel Bayesian Optimization algorithm that efficiently co-designs the morphology and grasping skills jointly through learned latent-space representations.  ...  Jointly optimizing morphology and control imposes computational challenges since it requires constant evaluation of a black-box function that measures the performance of a combination of embodiment and  ...  ACKNOWLEDGMENT We gratefully acknowledge the feedback from members in NVIDIA AI Algorithms research team. We also acknowledge Jonathan Tremblay from NVIDIA for the support on ViSII rendering.  ... 
arXiv:2012.12209v1 fatcat:bow2wkpywfbrvnscnsftabh74e

Co-Learning of Task and Sensor Placement for Soft Robotics

Andrew Spielberg, Alexander Amini, Lillian Chin, Wojciech Matusik, Daniela Rus
2021 IEEE Robotics and Automation Letters  
Index Terms: Soft robot materials and design, soft sensors and actuators, modeling, control, and learning for soft robots, deep learning methods.  ...  In this work, we present a novel representation for co-learning sensor placement and complex tasks.  ... 
doi:10.1109/lra.2021.3056369 fatcat:cjbigjlo6ncijkm4upqtwkw7bm

Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks [article]

Runpei Dong, Zhanhong Tan, Mengdi Wu, Linfeng Zhang, Kaisheng Ma
2022 arXiv   pre-print
In this paper, we present an adaptive-mapping quantization method to learn an optimal latent sub-distribution that is inherent within models and smoothly approximated with a concrete Gaussian Mixture (  ...  This sub-distribution evolves along with the weight update in a co-tuning schema guided by the direct task-objective optimization.  ...  The scheduling also employs loop tiling and unrolling (not shown in the figure). kw, kh: width and height of kernel. ci, co: input and output channel. ow, oh: width and height of output feature map.  ... 
arXiv:2112.15139v3 fatcat:weqti67lrndfznkulmdpwhqf6i

RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal Control [article]

Siddhant Gangapurwala, Mathieu Geisert, Romeo Orsolino, Maurice Fallon, Ioannis Havoutis
2020 arXiv   pre-print
When run online, the system tracks the generated footstep plans using a model-based controller. We evaluate the robustness of our method over a wide variety of complex terrains.  ...  We utilize on-board proprioceptive and exteroceptive feedback to map sensory information and desired base velocity commands into footstep plans using a reinforcement learning (RL) policy trained in simulation  ...  Additionally, as presented in Section V-A, training the footstep planner using deep RL enables natural emergence of planning behavior learned through interactions with the environment of operation.  ... 
arXiv:2012.03094v1 fatcat:ynp2g3ng6rbe3jsmqopz6rrwu4

Go with the Flow: Adaptive Control for Neural ODEs [article]

Mathieu Chalvidal, Matthew Ricci, Rufin VanRullen, Thomas Serre
2021 arXiv   pre-print
Here, we describe a new module called neurally controlled ODE (N-CODE) designed to improve the expressivity of NODEs.  ...  The parameters of N-CODE modules are dynamic variables governed by a trainable map from initial or current activation state, resulting in forms of open-loop and closed-loop control, respectively.  ...  Preciado, and George J. Pappas. Robust deep learning as optimal control: Insights and convergence guarantees. Proceedings of Machine Learning Research, 2020. Jianbo Shi and Jitendra Malik.  ... 
arXiv:2006.09545v3 fatcat:fl7i2tyqebh5dkvmduqjsl3xqq

Table of Contents

2020 IEEE Robotics and Automation Letters  
Hirata 6459: End-to-End Tactile Feedback Loop: From Soft Sensor Skin Over Deep GRU-Autoencoders to Tactile Stimulation  ...  Ruiz-del-Solar 5787: sEMG-Based Human-in-the-Loop Control of Elbow Assistive Robots for Physical Tasks and Muscle Strength Training  ... 
doi:10.1109/lra.2020.3030731 fatcat:kwx4xyitfbfuzgugbi5vavx2xu

2020 Index IEEE Robotics and Automation Letters Vol. 5

2020 IEEE Robotics and Automation Letters  
End-to-End Tactile Feedback Loop: From Soft Sensor Skin Over Deep GRU-Autoencoders to Tactile Stimulation., +, LRA July 2020 3990-3997  ... 
doi:10.1109/lra.2020.3032821 fatcat:qrnouccm7jb47ipq6w3erf3cja

Interactive Differentiable Simulation [article]

Eric Heiden, David Millard, Hejia Zhang, Gaurav S. Sukhatme
2020 arXiv   pre-print
We present experiments showing automatic task-based robot design and parameter estimation for nonlinear dynamical systems by automatically calculating gradients in IDS.  ...  Intelligent agents need a physical understanding of the world to predict the impact of their actions in the future.  ...  Learning dynamics models has a tradition in the field of robotics and control theory.  ... 
arXiv:1905.10706v3 fatcat:3d7572kkb5fljj3mddovpe4zvu

Advanced soft robot modeling in ChainQueen

Andrew Spielberg, Tao Du, Yuanming Hu, Daniela Rus, Wojciech Matusik
2021 Robotica (Cambridge. Print)  
Previous work established ChainQueen as a powerful tool for inference, control, and co-design for soft robotics.  ...  We demonstrate the power of our simulator extensions in over nine simulated experiments.  ...  To view supplementary material for this article, please visit S0263574721000722  ... 
doi:10.1017/s0263574721000722 fatcat:zgtenagcurel5ci26z6qkn4foe

Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data [article]

Priya Shukla, Vandana Kushwaha, G. C. Nandi
2021 arXiv   pre-print
To the best of our knowledge, developing an intelligent robot grasping model (based on semi-supervised learning) trained through representation learning and exploiting the high-quality learning ability  ...  In the case of robots, we cannot afford to spend that much time teaching them to grasp objects effectively.  ... 
arXiv:2112.03001v1 fatcat:elnqwsytbbdtxdnzpphxkxbtrq

Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks [article]

Jingjing Wang and Chunxiao Jiang and Haijun Zhang and Yong Ren and Kwang-Cheng Chen and Lajos Hanzo
2020 arXiv   pre-print
Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning.  ...  Future wireless networks have a substantial potential in terms of supporting a broad range of complex compelling applications both in military and civilian fields, where the users are able to enjoy high-rate  ...  The CNN model learned the relevant features through self-optimization during the GPU based training process, which was first designed in [323] .  ... 
arXiv:1902.01946v2 fatcat:7bveg6rmjfga5mftdkr3mst2qa

Policy Search for Model Predictive Control with Application to Agile Drone Flight [article]

Yunlong Song, Davide Scaramuzza
2021 arXiv   pre-print
Policy Search and Model Predictive Control (MPC) are two different paradigms for robot control: policy search has the strength of automatically learning complex policies using experienced data, while MPC  ...  An open research question is how to leverage and combine the advantages of both approaches.  ... 
arXiv:2112.03850v2 fatcat:65fldchqxrgzropf5k7fxe5uke

Optimizing Interactive Systems with Data-Driven Objectives

Ziming Li
2019 Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence  
It is promising to model the objectives directly from the user interactions and use them to optimize interactive systems, which will improve user experience and react dynamically to user actions.  ...  Generally, such objectives are manually crafted and rarely capture complex user needs in an accurate manner. We propose to infer the objective directly from observed user interactions.  ...  Following Assumption 3, system designers have control over the sets S, A, and the transition distribution, T, and T can be changed to optimize an interactive system.  ... 
doi:10.24963/ijcai.2019/912 dblp:conf/ijcai/Li19 fatcat:km5f6bkjpjbenbpisskfm6iwta
Showing results 1 — 15 out of 883 results