16,904 Hits in 9.4 sec

Offline-Online Reinforcement Learning for Energy Pricing in Office Demand Response: Lowering Energy and Data Costs [article]

Doseok Jang, Lucas Spangher, Manan Khattar, Utkarsha Agwan, Selvaprabu Nadarajah, Costas Spanos
<span title="2021-08-14">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We present two approaches to doing so: pretraining our model to warm start the experiment with simulated tasks, and using a planning model trained to simulate the real world's rewards to the agent.  ...  We present results that demonstrate the utility of offline reinforcement learning to efficient price-setting in the energy demand response problem.  ...  This work is supported by the Republic of Singapore's National Research Foundation through a grant to the Berkeley Education Alliance for Research in Singapore (BEARS) for the Singapore-Berkeley Building  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2108.06594v1">arXiv:2108.06594v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hy3l5fclgbgezp2aqkwvtzsdx4">fatcat:hy3l5fclgbgezp2aqkwvtzsdx4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210825081029/https://arxiv.org/pdf/2108.06594v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/b8/57/b857375c1b8bedca0ab1922a4b100e65410513fe.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2108.06594v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Online Multimodal Transportation Planning using Deep Reinforcement Learning [article]

Amirreza Farahani, Laura Genga, Remco Dijkman
<span title="2021-05-18">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper we propose a Deep Reinforcement Learning approach to solve a multimodal transportation planning problem, in which containers must be assigned to a truck or to trains that will transport them  ...  While traditional planning methods work "offline" (i.e., they take decisions for a batch of containers before the transportation starts), the proposed approach is "online", in that it can take decisions  ...  INTRODUCTION This paper introduces an online planning algorithm that we developed for a logistics company, based on Deep Reinforcement Learning (DRL).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2105.08374v1">arXiv:2105.08374v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hf4nq2xcnrh5hbgdvfsjjhrvpi">fatcat:hf4nq2xcnrh5hbgdvfsjjhrvpi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210520145028/https://arxiv.org/pdf/2105.08374v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/83/0a/830a2d16cbea76479af0a6324e7f68cab1611b5e.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2105.08374v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

The Challenges of Exploration for Offline Reinforcement Learning [article]

Nathan Lambert, Markus Wulfmeier, William Whitney, Arunkumar Byravan, Michael Bloesch, Vibhavari Dasagi, Tim Hertweck, Martin Riedmiller
<span title="2022-02-19">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour.  ...  With Explore2Offline, we propose to evaluate the quality of collected data by transferring the collected data and inferring policies with reward relabelling and standard offline RL algorithms.  ...  Offline Reinforcement Learning: With Offline Reinforcement Learning, we decouple the learning mechanism from exploration by training agents from fixed datasets.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2201.11861v2">arXiv:2201.11861v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/sajzyrnxuze6lo2lozj4szy4um">fatcat:sajzyrnxuze6lo2lozj4szy4um</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220521190925/https://arxiv.org/pdf/2201.11861v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/09/da/09da56cd3bf72b632c43969be97874fa14a3765c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2201.11861v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

A Meta Reinforcement Learning-based Approach for Self-Adaptive System [article]

Mingyue Zhang, Jialong Li, Haiyan Zhao, Kenji Tei, Shinichi Honiden, Zhi Jin
<span title="2021-05-11">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In addition, it designs a meta-reinforcement learning algorithm for learning the meta policy over the multiple models, so that the meta policy can quickly adapt to the real environment-system dynamics.  ...  It separates three concerns that are related to the adaptation policy and presents the modeling and synthesis process, with the goal of achieving higher model construction efficiency.  ...  reinforcement learning (MRL) is incorporated into the offline training phase and the online adaptation phase.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2105.04986v1">arXiv:2105.04986v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mbwvpuye3vf75ou4dc5oqgpg7e">fatcat:mbwvpuye3vf75ou4dc5oqgpg7e</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210513061520/https://arxiv.org/pdf/2105.04986v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/b3/d6/b3d6f462a5f183ef5f856a79311bdbf738ff00d2.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2105.04986v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [article]

Kuan Fang, Patrick Yin, Ashvin Nair, Sergey Levine
<span title="2022-05-17">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Second, we propose a hybrid approach which first pre-trains both the conditional subgoal generator and the policy on previously collected data through offline reinforcement learning, and then fine-tunes  ...  First, we decompose the goal-reaching problem hierarchically, with a high-level planner that sets intermediate subgoals using conditional subgoal generators in the latent space for a low-level model-free  ...  RELATED WORK We propose to use a combination of optimization-based planning and fine-tuning with goal-conditioned reinforcement learning from prior data in order to allow robots to learn temporally extended  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.08129v1">arXiv:2205.08129v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qeyaqj5vgjewpagp4v7laq7quq">fatcat:qeyaqj5vgjewpagp4v7laq7quq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220525002835/https://arxiv.org/pdf/2205.08129v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/15/3f/153fe0fa6aad8c16a66775ce3c7c750d810fa020.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.08129v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

The Profile of Students' Basic Teaching Skills Through Blended Learning in Microteaching Courses During Covid-19 Pandemic

Nurul Istiq'faroh
<span title="2022-02-23">2022</span> <i title="Universitas Pahlawan Tuanku Tambusai"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/q7k752zbnzhgrcgs3nsjbdiwjq" style="color: black;">Jurnal Basicedu</a> </i> &nbsp;
Due to the COVID-19 pandemic, prospective elementary school teachers must be able to teach both online and offline using a blended learning model.  ...  The data were collected using observations with a performance assessment rubric.  ...  ACKNOWLEDGMENT Finally, I would like to express my gratitude to the LPPM (Institute for Research and Community Service) of Universitas Nahdlatul Ulama Sidoarjo for giving financial assistance to the author  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.31004/basicedu.v6i2.2420">doi:10.31004/basicedu.v6i2.2420</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/apbuy5psdremnducskusf53ixy">fatcat:apbuy5psdremnducskusf53ixy</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220524153012/https://jbasic.org/index.php/basicedu/article/download/2420/pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/11/ac/11aca3040dd0fa86592605f5a1152269e0d0d700.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.31004/basicedu.v6i2.2420"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>

Offline Reinforcement Learning for Mobile Notifications [article]

Yiping Yuan, Ajith Muralidharan, Preetam Nandy, Miao Cheng, Prakruthi Prabhakar
<span title="2022-02-04">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Through simulations that approximate the notifications ecosystem, we demonstrate the performance and benefits of the offline evaluation approach as a part of the reinforcement learning modeling approach  ...  Finally, we collect data through online exploration in the production system, train an offline Double Deep Q-Network and launch a successful policy online.  ...  ACKNOWLEDGMENT: We are thankful to Shaunak Chatterjee, Yan Gao, Cyrus DiCiccio, Bee-Chung Chen, Deepak Agarwal, Matthew Walker, Romer Rosales, Shipeng Yu and Mohsen Jamali for their detailed and insightful  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2202.03867v1">arXiv:2202.03867v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/v4yibo6htvc6jbwgh7s4f5rhyq">fatcat:v4yibo6htvc6jbwgh7s4f5rhyq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220210225957/https://arxiv.org/pdf/2202.03867v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/73/33/733335b9a87afa43b8ce0a18541dd9e65db99d4c.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2202.03867v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Bridging the Gap between Reinforcement Learning and Knowledge Representation: A Logical Off- and On-Policy Framework [article]

Emad Saad
<span title="2010-12-07">2010</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We show that the complexity of finding an offline and online policy for a model-free reinforcement learning problem in our approach is NP-complete.  ...  In this paper, we bridge the gap between reinforcement learning and knowledge representation, by providing a rich knowledge representation framework, based on normal logic programs with answer set semantics  ...  In addition, we introduced online and offline logical framework to model-free reinforcement learning by relating model-free reinforcement learning in MDP environment to normal logic programs with answer  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1012.1552v1">arXiv:1012.1552v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/pqgnvdzv55gkxa6lyednmz57hu">fatcat:pqgnvdzv55gkxa6lyednmz57hu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200922034352/https://arxiv.org/ftp/arxiv/papers/1012/1012.1552.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/8b/c3/8bc383d4616a7ce1eafeefb5eeca441c4648f937.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1012.1552v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Risk Sensitive Model-Based Reinforcement Learning using Uncertainty Guided Planning [article]

Stefan Radic Webster, Peter Flach
<span title="2021-11-09">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this paper, risk sensitivity is promoted in a model-based reinforcement learning algorithm by exploiting the ability of a bootstrap ensemble of dynamics models to estimate environment epistemic uncertainty  ...  Identifying uncertainty and taking mitigating actions is crucial for safe and trustworthy reinforcement learning agents, especially when deployed in high-risk environments.  ...  Acknowledgments and Disclosure of Funding We would like to thank Tom Bewley and Jonathan Thomas for their useful discussions and feedback while conceptualising this paper.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.04972v1">arXiv:2111.04972v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/4d2y4uqm5zcl7eelnddkrztx3m">fatcat:4d2y4uqm5zcl7eelnddkrztx3m</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211114133720/https://arxiv.org/pdf/2111.04972v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e3/df/e3dfa91f3ac6aad3fb66f0cead401dcf1ebb9076.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2111.04972v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Hierarchical Planning Through Goal-Conditioned Offline Reinforcement Learning [article]

Jinning Li, Chen Tang, Masayoshi Tomizuka, Wei Zhan
<span title="2022-05-24">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Offline Reinforcement Learning (RL) has shown potential in many safety-critical tasks in robotics where exploration is risky and expensive.  ...  We improve the offline training to deal with out-of-distribution goals by a perturbed goal sampling process.  ...  In this work, we propose a hierarchical planning framework through goal-conditioned offline reinforcement learning.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.11790v1">arXiv:2205.11790v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/wbdblkpj2rehjdetl7vn4udziq">fatcat:wbdblkpj2rehjdetl7vn4udziq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220526105322/https://arxiv.org/pdf/2205.11790v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/f5/93/f593dc96b20ce8427182e773e3b2192d707706a8.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.11790v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Self-regulated in a Blended Learning: Case Study On Educational Doctoral candidates

Fitri April Yanti, Ahmad Walid, Habibi Habibi, M. Anas Thohir
<span title="2021-12-30">2021</span> <i title="LPPM IKIP Mataram"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/dsmoi24fabcafaxv3646sfeon4" style="color: black;">Prisma Sains : Jurnal Pengkajian Ilmu dan Pembelajaran Matematika dan IPA IKIP Mataram</a> </i> &nbsp;
The results show that planning for the completion of student assignments begins with (1) setting goals and planning strategies with (a) planning time for completing assignments; (b) cooperation with peers  ...  Self-regulation in blended learning is described in four phases according to Zimmerman, namely planning, monitoring, evaluation, and reinforcing.  ...  According to Hrastinski (2019), blended learning is learning that combines offline and online modes. In online lectures, students use the Zoom application as a learning medium.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.33394/j-ps.v9i2.4410">doi:10.33394/j-ps.v9i2.4410</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/liox4tp5sncgjawhxv3jjnsnom">fatcat:liox4tp5sncgjawhxv3jjnsnom</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220420215858/http://e-journal.undikma.ac.id/index.php/prismasains/article/download/4410/3083" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/18/cc/18ccfdad42ee9a49da1d03035a3c6de0d008c548.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.33394/j-ps.v9i2.4410"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>

Towards biologically plausible Dreaming and Planning [article]

Cristiano Capone, Pier Stanislao Paolucci
<span title="2022-05-20">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Recent model-based approaches show promising results by reducing the number of necessary interactions with the environment to learn a desirable policy.  ...  Importantly, our model does not require the detailed storage of experiences, and it learns the world model online.  ...  Our model provides a proof of concept that even small and cheap networks with an online learning rule can learn and exploit world models to boost learning.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.10044v1">arXiv:2205.10044v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kxhw64qc5bcuzmba2ihndojoaa">fatcat:kxhw64qc5bcuzmba2ihndojoaa</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220526200302/https://arxiv.org/pdf/2205.10044v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/fd/93/fd939053136bd822c95dee83fb5040ab7d70adf4.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2205.10044v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Offline Reinforcement Learning as One Big Sequence Modeling Problem [article]

Michael Janner, Qiyang Li, Sergey Levine
<span title="2021-11-29">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
To this end, we explore how RL can be tackled with the tools of sequence modeling, using a Transformer architecture to model distributions over trajectories and repurposing beam search as a planning algorithm  ...  Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.  ...  This work was partially supported by computational resource donations from Microsoft. M.J. is supported by fellowships from the National Science Foundation and the Open Philanthropy Project.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2106.02039v4">arXiv:2106.02039v4</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/y5edhwlrjje63jtbe7hxw3t7oq">fatcat:y5edhwlrjje63jtbe7hxw3t7oq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210723012454/https://arxiv.org/pdf/2106.02039v2.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/b3/60/b3607ec83ce23c22f06a982e2fb6357f39cd3444.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2106.02039v4" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Development of Hybrid Discovery Learning (HDL) Model for Integrated Social Studies Learning

Lukman Nadjamuddin, Sunarto Amus, Jamaludin Jamaludin, Sriati Usman, Idrus A. Rore, Nurgan Tadeko, Muhammad Zaky
<span title="2022-02-09">2022</span> <i title="PLUS COMMUNICATION CONSULTING SRL"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/d5n5kohxhfffrpdcgopk67cpxa" style="color: black;">Technium Social Sciences Journal</a> </i> &nbsp;
The developed model was tested with two different focus group discussions (FGD), followed by 24 teachers and instructors of the social studies, and tested for 12 classes.  ...  The Covid-19 crisis has led to a widening of the scope and role of online-based learning and information technology (IT)-based education.  ...  Acknowledgment: This study was supported by BLU fund of Tadulako University.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.47577/tssj.v28i1.5953">doi:10.47577/tssj.v28i1.5953</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zb5btx6dsfczbb3khk4bbnmyqe">fatcat:zb5btx6dsfczbb3khk4bbnmyqe</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220423053436/https://techniumscience.com/index.php/socialsciences/article/download/5953/2048" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/68/d2/68d23cc455784fc2a5c103aba5a983f974c570db.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.47577/tssj.v28i1.5953"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems [article]

Sergey Levine, Aviral Kumar, George Tucker, Justin Fu
<span title="2020-11-01">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Effective offline reinforcement learning methods would be able to extract policies with the maximum possible utility out of the available data, thereby allowing automation of a wide range of decision-making  ...  In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize  ...  In online reinforcement learning (a), the policy π_k is updated with streaming data collected by π_k itself.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2005.01643v3">arXiv:2005.01643v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kyw5xc4dijgz3dpuytnbcrmlam">fatcat:kyw5xc4dijgz3dpuytnbcrmlam</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201106132758/https://arxiv.org/pdf/2005.01643v3.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/36/96/369679dafc188d7dbc0580a35586a8dcbe5d2016.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2005.01643v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>
Showing results 1–15 of 16,904