Learning from the memory of Atari 2600 [article]

Jakub Sygnowski, Henryk Michalewski
2016 arXiv   pre-print
We train a number of neural networks to play games Bowling, Breakout and Seaquest using information stored in the memory of a video game console Atari 2600. We consider four models of neural networks which differ in size and architecture: two networks which use only information contained in the RAM and two mixed networks which use both information in the RAM and information from the screen. As the benchmark we used the convolutional model proposed in NIPS and received comparable results in all considered games. Quite surprisingly, in the case of Seaquest we were able to train RAM-only agents which behave better than the benchmark screen-only agent. Mixing screen and RAM did not lead to an improved performance comparing to screen-only and RAM-only agents.
arXiv:1605.01335v1 fatcat:l7eyhl3hsvg3ldkbyb455g25fe

Sequential Changepoint Detection in Neural Networks with Checkpoints [article]

Michalis K. Titsias, Jakub Sygnowski, Yutian Chen
2020 arXiv   pre-print
We introduce a framework for online changepoint detection and simultaneous model learning which is applicable to highly parametrized models, such as deep neural networks. It is based on detecting changepoints across time by sequentially performing generalized likelihood ratio tests that require only evaluations of simple prediction score functions. This procedure makes use of checkpoints, consisting of early versions of the actual model parameters, that allow to detect distributional changes by performing predictions on future data. We define an algorithm that bounds the Type I error in the sequential testing procedure. We demonstrate the efficiency of our method in challenging continual learning applications with unknown task changepoints, and show improved performance compared to online Bayesian changepoint detection.
arXiv:2010.03053v1 fatcat:wdzbjqnysfeqjm7dkspvducbjy

Importance Weighted Policy Learning and Adaptation [article]

Alexandre Galashov, Jakub Sygnowski, Guillaume Desjardins, Jan Humplik, Leonard Hasenclever, Rae Jeong, Yee Whye Teh, Nicolas Heess
2021 arXiv   pre-print
The ability to exploit prior experience to solve novel problems rapidly is a hallmark of biological learning systems and of great practical importance for artificial ones. In the meta reinforcement learning literature much recent work has focused on the problem of optimizing the learning process itself. In this paper we study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning. The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior, or default behavior that constrains the space of solutions and serves as a bias for exploration; as well as a representation for the value function, both of which are easily learned from a number of training tasks in a multi-task scenario. Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv:2009.04875v2 fatcat:7376gfqvqfa4xfq3di35imcqgu

Meta-Learning with Latent Embedding Optimization [article]

Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell
2019 arXiv   pre-print
Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space. The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks. Further analysis indicates LEO is able to capture uncertainty in the data, and can perform adaptation more effectively by optimizing in latent space.
arXiv:1807.05960v3 fatcat:mzhtivi6s5cjjjifiznmvkac5y

An Empirical Study of Implicit Regularization in Deep Offline RL [article]

Caglar Gulcehre, Srivatsan Srinivasan, Jakub Sygnowski, Georg Ostrovski, Mehrdad Farajtabar, Matt Hoffman, Razvan Pascanu, Arnaud Doucet
2022 arXiv   pre-print
Deep neural networks are the most commonly used function approximators in offline reinforcement learning. Prior works have shown that neural nets trained with TD-learning and gradient descent can exhibit implicit regularization that can be characterized by under-parameterization of these networks. Specifically, the rank of the penultimate feature layer, also called effective rank, has been observed to drastically collapse during the training. In turn, this collapse has been argued to reduce the model's ability to further adapt in later stages of learning, leading to the diminished final performance. Such an association between the effective rank and performance makes effective rank compelling for offline RL, primarily for offline policy evaluation. In this work, we conduct a careful empirical study on the relation between effective rank and performance on three offline RL datasets: bsuite, Atari, and DeepMind lab. We observe that a direct association exists only in restricted settings and disappears in the more extensive hyperparameter sweeps. Also, we empirically identify three phases of learning that explain the impact of implicit regularization on the learning dynamics and found that bootstrapping alone is insufficient to explain the collapse of the effective rank. Further, we show that several other factors could confound the relationship between effective rank and performance and conclude that studying this association under simplistic assumptions could be highly misleading.
arXiv:2207.02099v2 fatcat:23kvutruqvahzdoodbas7eyeiu

Open-Ended Learning Leads to Generally Capable Agents [article]

Open Ended Learning Team, Adam Stooke, Anuj Mahajan, Catarina Barros, Charlie Deck, Jakob Bauer, Jakub Sygnowski, Maja Trebacz, Max Jaderberg, Michael Mathieu, Nat McAleese, Nathalie Bradley-Schmieg (+6 others)
2021 arXiv   pre-print
Jakub Sygnowski: Infrastructure development, agent analysis, and research investigations. Maja Trebacz: Research investigations.  ... 
arXiv:2107.12808v2 fatcat:wp5lbeezmrb6rdsbqf6etgurtq

Meta-learning by the baldwin effect

Chrisantha Fernando, Jakub Sygnowski, Simon Osindero, Jane Wang, Tom Schaul, Denis Teplyashin, Pablo Sprechmann, Alexander Pritzel, Andrei Rusu
2018 Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO '18)
The scope of the Baldwin effect was recently called into question by two papers that closely examined the seminal work of Hinton and Nowlan. To this date there has been no demonstration of its necessity in empirically challenging tasks. Here we show that the Baldwin effect is capable of evolving few-shot supervised and reinforcement learning mechanisms, by shaping the hyperparameters and the initial parameters of deep learning algorithms. Furthermore it can genetically accommodate strong learning biases on the same set of problems as a recent machine learning algorithm called MAML "Model Agnostic Meta-Learning" which uses second-order gradients instead of evolution to learn a set of reference parameters (initial weights) that can allow rapid adaptation to tasks sampled from a distribution. Whilst in simple cases MAML is more data efficient than the Baldwin effect, the Baldwin effect is more general in that it does not require gradients to be backpropagated to the reference parameters or hyperparameters, and permits effectively any number of gradient updates in the inner loop. The Baldwin effect learns strong learning dependent biases, rather than purely genetically accommodating fixed behaviours in a learning independent manner.
doi:10.1145/3205651.3205763 dblp:conf/gecco/FernandoSOWSTSP18 fatcat:gdtvorekirgsvhhs5acq3dufaq

Theano: A Python framework for fast computation of mathematical expressions [article]

The Theano Development Team: Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson (+90 others)
2016 arXiv   pre-print
Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, multiple frameworks have been built on top of it and it has been used to produce many ... machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently-introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
arXiv:1605.02688v1 fatcat:2lcqwrk2zrbt5dyjmcofn6shhu

Dual Path Structural Contrastive Embeddings for Learning Novel Objects [article]

Bingbin Li, Elvis Han Cui, Yanan Li, Donghui Wang, Weng Kee Wong
2022 arXiv   pre-print
PMLR, 2020. [50] Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia Hadsell.  ...  PMLR, 13–18 Jul 2020. [49] Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan  ...  [69] Imtiaz Ziko, Jose Dolz, Eric Granger, and Ismail Ben Ayed.  ... 
arXiv:2112.12359v3 fatcat:x3utnehujrbwrirkap7qvu3v4i

Adaptive Cross-Modal Few-Shot Learning [article]

Chen Xing, Negar Rostamzadeh, Boris N. Oreshkin, Pedro O. Pinheiro
2020 arXiv   pre-print
IJCV, 2015. [41] Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia Hadsell. Meta-learning with latent embedding optimization.  ...  In CVPR, 2005. 8 [2] Matthias Bauer, Mateo Rojas-Carulla, Jakub Bartlomiej Swikatkowski, Bernhard Scholkopf, and Richard E Turner.  ... 
arXiv:1902.07104v3 fatcat:mmjrixwfkzg7fknqaxlgjggdki

CLTA: Contents and Length-based Temporal Attention for Few-shot Action Recognition [article]

Yang Bo, Yangdi Lu, Wenbo He
2021 arXiv   pre-print
[29] Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia  ...  In Proceedings of the IEEE Conference on Computer Vision and Pattern Recog-  ... 
arXiv:2103.10567v1 fatcat:yl7extxpbbgxpodxgnmtnfma2q

Vector Quantized Models for Planning [article]

Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals
2021 arXiv   pre-print
We'd also like to thank Sander Dielman, David Silver, Yoshua Bengio, Jakub Sygnowski, and Aravind Srinivas for useful discussions and suggestions.  ... 
arXiv:2106.04615v2 fatcat:kxuhklldsfcpretmxdgzafdxvu

Concept Learners for Few-Shot Learning [article]

Kaidi Cao, Maria Brbic, Jure Leskovec
2021 arXiv   pre-print
Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia Hadsell. Meta-learning with latent embedding optimization.  ... 
arXiv:2007.07375v3 fatcat:4canqso2azhibnjootonpva6xa

Memory-Efficient Semi-Supervised Continual Learning: The World is its Own Replay Buffer [article]

James Smith, Jonathan Balloch, Yen-Chang Hsu, Zsolt Kira
2021 arXiv   pre-print
Xu He, Jakub Sygnowski, Alexandre Galashov, Andrei A Rusu, Yee Whye Teh, and Razvan Pascanu. Task agnostic continual learning via meta learning. arXiv preprint arXiv:1906.05201, 2019.  ... 
arXiv:2101.09536v2 fatcat:inyfzbclgfbw7mrej4cxh5q5ze

NOVAS: Non-convex Optimization via Adaptive Stochastic Search for End-to-End Learning and Control [article]

Ioannis Exarchos and Marcus A. Pereira and Ziyi Wang and Evangelos A. Theodorou
2021 arXiv   pre-print
Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia Hadsell. Meta-learning with latent embedding optimization. arXiv preprint arXiv:1807.05960, 2018.  ... 
arXiv:2006.11992v3 fatcat:r4tgjunt7nhznbpncjocvvxmk4