23,004 Hits in 4.2 sec

Conservative Objective Models for Effective Offline Model-Based Optimization [article]

Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine
2021 arXiv   pre-print
To overcome this, we propose conservative objective models (COMs), a method that learns a model of the objective function that lower bounds the actual value of the ground-truth objective on out-of-distribution  ...  In this paper, we aim to solve data-driven model-based optimization (MBO) problems, where the goal is to find a design input that maximizes an unknown objective function provided access to only a static  ...  Acknowledgements We thank anonymous ICML reviewers, Aurick Zhou and Justin Fu for discussions and feedback on the tasks and the method in this paper, and all other members from RAIL at UC Berkeley for  ... 
arXiv:2107.06882v1 fatcat:xjysqud46jbczboei62usv6rdy
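The conservatism idea summarized in this entry can be caricatured in a few lines: fit a surrogate to the offline data, then penalize the gap between its predictions at gradient-ascent ("adversarial") points and at dataset points. The following is a hypothetical minimal sketch; the quadratic surrogate `f`, the step sizes, and the `com_loss` helper are illustrative assumptions, not the paper's actual method (which uses neural networks):

```python
def f(w, x):
    """Toy quadratic surrogate f(x) = w2*x^2 + w1*x + w0 (stand-in for a neural net)."""
    return w[2] * x * x + w[1] * x + w[0]

def ascend(w, x, lr=0.1, steps=5):
    """A few gradient-ascent steps on the surrogate: these are the
    out-of-distribution designs whose predicted value a COM pushes down."""
    for _ in range(steps):
        x = x + lr * (2 * w[2] * x + w[1])  # df/dx for the quadratic surrogate
    return x

def com_loss(w, data, alpha):
    """Supervised fit plus a conservatism term that penalizes the surrogate
    for overestimating at ascent points relative to dataset points."""
    mse = sum((f(w, x) - y) ** 2 for x, y in data) / len(data)
    gap = sum(f(w, ascend(w, x)) - f(w, x) for x, _ in data) / len(data)
    return mse + alpha * gap
```

With a concave surrogate such as `w = [-1, 2, -1]`, gradient ascent raises the predicted value, so the gap term is positive and a larger `alpha` yields a larger (more conservative) loss.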

RoMA: Robust Model Adaptation for Offline Model-based Optimization [article]

Sihyun Yu, Sungsoo Ahn, Le Song, Jinwoo Shin
2021 arXiv   pre-print
To handle the issue, we propose a new framework, coined robust model adaptation (RoMA), based on gradient-based optimization of inputs over the DNN.  ...  Specifically, it consists of two steps: (a) a pre-training strategy to robustly train the proxy model and (b) a novel adaptation procedure of the proxy model to have robust estimates for a specific set  ...  Related work Offline model-based optimization. Offline model-based optimization (MBO) has recently gained interest for objective functions that are expensive or dangerous to evaluate.  ... 
arXiv:2110.14188v1 fatcat:r5vw4rniozdrfehftjkmdrr6pa

An adaptive robust optimization scheme for water-flooding optimization in oil reservoirs using residual analysis*
*The authors acknowledge financial support from the Recovery Factory program sponsored by Shell Global Solutions International.

M. Mohsin Siraj, Paul M.J. Van den Hof, Jan Dirk Jansen
2017 IFAC-PapersOnLine  
Model-based dynamic optimization of the water-flooding process in oil reservoirs is a computationally complex problem and suffers from high levels of uncertainty.  ...  These models are generally not validated with data and the resulting robust optimization strategies are mostly offline or open-loop.  ...  Another indicator for the effect of uncertainty is the worst-case value.  ... 
doi:10.1016/j.ifacol.2017.08.1632 fatcat:6uacc2gprzgd3cri3smtmplfry

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems [article]

Sergey Levine, Aviral Kumar, George Tucker, Justin Fu
2020 arXiv   pre-print
Offline reinforcement learning algorithms hold tremendous promise for making it possible to turn large datasets into powerful decision making engines.  ...  Effective offline reinforcement learning methods would be able to extract policies with the maximum possible utility out of the available data, thereby allowing automation of a wide range of decision-making  ...  Conservative model-based RL algorithms.  ... 
arXiv:2005.01643v3 fatcat:kyw5xc4dijgz3dpuytnbcrmlam

Data-Driven Offline Optimization For Architecting Hardware Accelerators [article]

Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine
2022 arXiv   pre-print
In addition, PRIME also architects effective accelerators for unseen applications in a zero-shot setting, outperforming simulation-based methods by 1.26x.  ...  In this paper, we develop such a data-driven offline optimization method for designing hardware accelerators, dubbed PRIME, that enjoys all of these properties.  ...  ACKNOWLEDGEMENTS We thank the "Learn to Design Accelerators" team at Google Research and the Google EdgeTPU team for their invaluable feedback and suggestions.  ... 
arXiv:2110.11346v3 fatcat:ldkyj4enqbh5posziuozl5tj74

Offline Reinforcement Learning with Reverse Model-based Imagination [article]

Jianhao Wang, Wenzhe Li, Haozhe Jiang, Guangxiang Zhu, Siyuan Li, Chongjie Zhang
2021 arXiv   pre-print
These reverse imaginations provide informed data augmentation for model-free policy learning and enable conservative generalization beyond the offline dataset.  ...  ROMI can effectively combine with off-the-shelf model-free algorithms to enable model-based generalization with proper conservatism.  ...  Acknowledgments and Disclosure of Funding The authors would like to thank the anonymous reviewers, Zhizhou Ren, Kun Xu, and Hang Su for valuable and insightful discussions and helpful suggestions.  ... 
arXiv:2110.00188v2 fatcat:n5k3r3kwknglxg6w2lz3tcyhae

Surgical Scheduling via Optimization and Machine Learning with Long-Tailed Data [article]

Yuan Shi, Saied Mahdian, Jose Blanchet, Peter Glynn, Andrew Y. Shin, David Scheinker
2022 arXiv   pre-print
Compared to the current paper-based system used in the hospital, most optimization models failed to reduce congestion without increasing wait times for surgery.  ...  A conservative stochastic optimization with sufficient sampling to capture the long tail of the LOS distribution outperformed the current manual process.  ...  We thus opted for classification-based models as a simpler alternative to regression models for LOS estimation.  ... 
arXiv:2202.06383v1 fatcat:jj3xb7wk2fd7bcn72rmr5yg6kq

Reliable Offline Model-based Optimization for Industrial Process Control [article]

Cheng Feng, Jinyan Guan
2022 arXiv   pre-print
2) how to learn a reliable but not over-conservative control policy from offline data by utilizing existing model-based optimization algorithms?  ...  In the research area of offline model-based optimization, novel and promising methods are frequently developed.  ...  Each CGAN model is trained with 3000 epochs.  ... 
arXiv:2205.07250v1 fatcat:f2qwxlcoqbcghjlp3frzgkps7e

Model-Based Offline Planning with Trajectory Pruning [article]

Xianyuan Zhan, Xiangyu Zhu, Haoran Xu
2022 arXiv   pre-print
The model-based planning framework provides an attractive alternative. However, most model-based planning algorithms are not designed for offline settings.  ...  Experimental results show that MOPP provides competitive performance compared with existing model-based offline planning and RL approaches.  ...  Most model-based planning methods are designed for online settings.  ... 
arXiv:2105.07351v3 fatcat:sl24a3eh5fayblgx56my3oe3w4

Integration of Numerical Simulation and Control Scheme for Energy Conservation of Aluminum Melting Furnaces

Na Guo, Hongyu Zheng, Tao Zou, Yang Jia
2019 IEEE Access  
For numerical simulation, a nonlinear steady-state optimization is performed offline to obtain optimal operating points.  ...  For control scheme, a two-layer model predictive control which consists of steady state target calculation and dynamic optimization is developed to track the optimal operating conditions.  ...  The energy conservation technology for melting furnaces can be classified into hardware-based strategy and software-based strategy.  ... 
doi:10.1109/access.2019.2934187 fatcat:thlwd53odbhi5fnfpc356gjz24

A Workflow for Offline Model-Free Robotic Reinforcement Learning [article]

Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine
2021 arXiv   pre-print
Although offline RL methods can learn from prior data, there is no clear and well-understood process for making various design choices, from model architecture to algorithm hyperparameters, without actually  ...  Our workflow is derived from a conceptual understanding of the behavior of conservative offline RL algorithms and cross-validation in supervised learning.  ...  setup as well as for providing us with offline datasets we could test our workflow on.  ... 
arXiv:2109.10813v2 fatcat:bt5kt23fgfcxbblzc4464hrsf4

A Species Conservation-Based Particle Swarm Optimization with Local Search for Dynamic Optimization Problems

Dingcai Shen, Bei Qian, Min Wang
2020 Computational Intelligence and Neuroscience  
The experimental results show the effectiveness and efficiency of the proposed algorithm for tracking the moving optima in dynamic environments.  ...  To address this requirement, a species conservation-based particle swarm optimization (PSO), combined with a spatial neighbourhood best searching technique, is proposed.  ...  In this work, a species conservation-based PSO combined with a spatial neighbourhood best searching is proposed for DOPs.  ... 
doi:10.1155/2020/2815802 pmid:32802025 pmcid:PMC7416227 fatcat:lt4hljvmfvdbnn3zonvo5xidga
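For readers unfamiliar with the base algorithm this entry extends, a plain particle swarm optimizer (without the species-conservation and local-search additions, which are the paper's contributions) can be sketched as follows; all constants here are conventional defaults, not the paper's settings:

```python
import random

def pso(objective, lo, hi, n_particles=10, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a 1-D objective on [lo, hi] with a basic particle swarm.
    Species conservation (the paper's extension) would additionally preserve
    the best member of each species to track moving optima."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    pbest = xs[:]                               # per-particle best positions
    gbest = min(xs, key=objective)              # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vs[i] = w * vs[i] + c1 * r1 * (pbest[i] - xs[i]) + c2 * r2 * (gbest - xs[i])
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))  # clamp to the search box
            if objective(xs[i]) < objective(pbest[i]):
                pbest[i] = xs[i]
            if objective(xs[i]) < objective(gbest):
                gbest = xs[i]
    return gbest
```

In a dynamic optimization problem, the swarm above would lose previously found optima when the landscape shifts; the species-conservation mechanism exists precisely to retain them.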

COMBO: Conservative Offline Model-Based Policy Optimization [article]

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn
2022 arXiv   pre-print
Model-based algorithms, which learn a dynamics model from logged experience and perform some sort of pessimistic planning under the learned model, have emerged as a promising paradigm for offline reinforcement  ...  Through experiments, we find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods on widely studied offline RL benchmarks, including image-based  ...  Acknowledgments and Disclosure of Funding We thank members of RAIL and IRIS for their support and feedback.  ... 
arXiv:2102.08363v2 fatcat:azvca4wb65gc5aypjpidqphgzi
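A loose tabular caricature of the conservatism this entry describes: a standard Bellman backup on the offline data, plus a CQL-style term that lowers Q-values on model-generated state-action pairs and raises them on in-support pairs. This is an illustrative assumption about the mechanism, not the paper's actual update rule:

```python
def combo_update(Q, dataset, model_rollouts, alpha=1.0, lr=0.1, gamma=0.99):
    """Q: dict state -> dict action -> value.
    dataset: real transitions (s, a, r, s2); model_rollouts: synthetic (s, a) pairs."""
    for s, a, r, s2 in dataset:
        target = r + gamma * max(Q[s2].values())   # Bellman backup on real data
        Q[s][a] += lr * (target - Q[s][a])
    for s, a in model_rollouts:                    # pessimism on synthetic pairs
        Q[s][a] -= lr * alpha
    for s, a, _, _ in dataset:                     # keep in-support values up
        Q[s][a] += lr * alpha
    return Q
```

The net effect mirrors the stated goal: actions seen only in model rollouts end up valued below actions supported by real data, biasing the learned policy toward the dataset.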

An Efficient Robust Optimization Workflow using Multiscale Simulation and Stochastic Gradients

Rafael J. de Moraes, Rahul-Mark Fonseca, Mircea A. Helici, Arnold W. Heemink, Jan Dirk Jansen
2018 Journal of Petroleum Science and Engineering  
In the workflow, the construction of the basis functions is performed at an offline stage and they are not reconstructed/updated throughout the optimization process.  ...  The combination of speed and accuracy of MS forward simulation with the flexibility of the StoSAG technique allows for a flexible and efficient optimization workflow suitable for large-scale problems.  ...  Hadi Hajibeygi for useful discussions related to multiscale simulation.  ... 
doi:10.1016/j.petrol.2018.09.047 fatcat:mduxcpqku5dntnrubi52scxuau

Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization [article]

Tatsuya Matsushima, Hiroki Furuta, Yutaka Matsuo, Ofir Nachum, Shixiang Gu
2020 arXiv   pre-print
We propose a novel model-based algorithm, Behavior-Regularized Model-ENsemble (BREMEN) that can effectively optimize a policy offline using 10-20 times fewer data than prior works.  ...  We observe that naïvely applying existing model-free offline RL algorithms recursively does not lead to a practical deployment-efficient and sample-efficient algorithm.  ...  Acknowledgments We thank Yusuke Iwasawa, Emma Brunskill, Lihong Li, Sergey Levine, and George Tucker for insightful comments and discussion.  ... 
arXiv:2006.03647v2 fatcat:36zg5vzq2zgxtjyu54z2jno4ki