Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning [article]

Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio
2022 arXiv   pre-print
At the same time, a fast stream is parameterized as a Transformer to process chunks consisting of K time-steps conditioned on the information in the slow-stream.  ...  Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector.  ...  We use a chunk size of 20 and set the number of temporal latent bottleneck state vectors to 20. For training, we use Adam optimizer with a learning rate of 0.0001. We train the model for 5000 steps.  ... 
arXiv:2205.14794v1 fatcat:rqcendk3xjanhfwp22rlvd6mme

SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition [article]

Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess
2021 arXiv   pre-print
We demonstrate these capabilities, as well as the model's performance in terms of view synthesis and instance segmentation, across three procedurally generated video datasets.  ...  Leveraging the shared structure that exists across different scenes, our model learns to infer two sets of latent representations from RGB video input alone: a set of "object" latents, corresponding to  ...  Grounded language learning fast and slow. arXiv preprint arXiv:2009.01719, 2020. [70] Nicholas Watters, Loic Matthey, Christopher P Burgess, and Alexander Lerchner.  ... 
arXiv:2106.03849v2 fatcat:vtgeigfcrbgj5jnwka2e2djgxe

CCVS: Context-aware Controllable Video Synthesis [article]

Guillaume Le Moing, Jean Ponce, Cordelia Schmid
2021 arXiv   pre-print
the synthesis process on contextual information for temporal continuity and ancillary information for fine control.  ...  by affording simple mechanisms for handling multimodal ancillary information for controlling the synthesis process (e.g., a few sample frames, an audio track, a trajectory in image space) and taking into  ...  JP was supported in part by the Louis Vuitton/ENS chair in artificial intelligence and the Inria/NYU collaboration. We thank the reviewers for useful comments.  ... 
arXiv:2107.08037v2 fatcat:xmpykxzkz5cxrppejzlc5u6i6i

Learning the rules of cell competition without prior scientific knowledge [article]

Christopher Soelistyo, Giulia Vallardi, Guillaume Charras, Alan R Lowe
2021 bioRxiv   pre-print
Deep learning is now a powerful tool in microscopy data analysis, and is routinely used for image processing applications such as segmentation and denoising.  ...  Using the τ-VAE's latent representation of the local tissue organization and the flow of information in the network, we decode the physical parameters responsible for correct prediction of fate in cell  ...  ARL and GC wish to acknowledge the support of BBSRC grant BB/S009329/1.  ... 
doi:10.1101/2021.11.24.469554 fatcat:l5lpkheqrjbqvdi6p7qbmpwxpm

Deep generative models for musical audio synthesis [article]

M. Huzaifah, L. Wyse
2020 arXiv   pre-print
This paper is a review of developments in deep learning that are changing the practice of sound modelling.  ...  Recent generative deep learning systems for audio synthesis are able to learn models that can traverse arbitrary spaces of sound defined by the data they train on.  ...  Acknowledgements This research was supported by a Singapore MOE Tier 2 grant, "Learning Generative Recurrent Neural Networks," and by an NVIDIA Corporation Academic Programs GPU grant.  ... 
arXiv:2006.06426v2 fatcat:swt7npt3gnbj5ppzcf2ef3rose

A Survey on Neural Speech Synthesis [article]

Xu Tan, Tao Qin, Frank Soong, Tie-Yan Liu
2021 arXiv   pre-print
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural speech given text, is a hot research topic in speech, language, and machine learning communities and has broad  ...  As the development of deep learning and artificial intelligence, neural network-based TTS has significantly improved the quality of synthesized speech in recent years.  ...  Specifically, as TTS is a typical sequence to sequence generation task with slow autoregressive generation, how to speed up the autoregressive generation or reduce the model size for fast speech synthesis  ... 
arXiv:2106.15561v3 fatcat:pbrbs6xay5e4fhf4ewlp7qvybi

Style-ERD: Responsive and Coherent Online Motion Style Transfer [article]

Tianxin Tao, Xiaohang Zhan, Zhongquan Chen, Michiel van de Panne
2022 arXiv   pre-print
and temporal attention.  ...  Although our method targets online settings, it outperforms previous offline methods in motion realism and style expressiveness and provides significant gains in runtime efficiency  ...  However, the sequential nature slows down the runtime of our method when used in conjunction with a long input sequence, although this is not our focus.  ... 
arXiv:2203.02574v2 fatcat:u75ufuqkmfhlldjk4bbvboopay

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons [article]

Paul Haider, Benjamin Ellenberger, Laura Kriener, Jakob Jordan, Walter Senn, Mihai A. Petrovici
2021 arXiv   pre-print
We introduce Latent Equilibrium, a new framework for inference and learning in networks of slow components which avoids these issues by harnessing the ability of biological neurons to phase-advance their  ...  This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference  ...  Furthermore, we thank Mathieu Le Douairon, Reinhard Dietrich and the Insel Data Science Center for the usage and outstanding support of their Research HPC Cluster.  ... 
arXiv:2110.14549v1 fatcat:gbypwhc4rfcjbh5s5qlm7ds67a

End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation [article]

Haojie Liu, Ming Lu, Zhiqi Chen, Xun Cao, Zhan Ma, Yao Wang
2021 arXiv   pre-print
and H.265/HEVC, as well as recently published learning-based methods, in terms of both PSNR and MS-SSIM metrics.  ...  In spite of the great success of adaptive kernel-based resampling (e.g., adaptive convolutions and deformable convolutions) in video prediction for uncompressed videos, integrating such approaches with  ...  Such methods often fail in regions with occlusions and fast motions because of the limitation of the bilinear warping in the pixel domain.  ... 
arXiv:2108.04103v1 fatcat:u43qnoz5pfgmvo3lwsmvr4kk4m

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning [article]

Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov
2022 arXiv   pre-print
To improve video quality and consistency, we propose a new video token trained with self-learning and an improved mask-prediction algorithm for sampling video tokens.  ...  We introduce text augmentation to improve the robustness of the textual representation and diversity of generated videos.  ...  The textual input also controls the speed, where "slow" indicates videos with slow speed such that the motion is slow, while "fast" indicates the performed motion is fast.  ... 
arXiv:2203.02573v1 fatcat:77g4rpc6vfayhgjrkdn66rcwdy

Futuristic methods in virus genome evolution using the Third-Generation DNA sequencing and artificial neural networks [article]

Hyunjin Shim
2019 arXiv   pre-print
Applications of the Third-Generation sequencing enable real-time and on-site data production, changing the research paradigms in environmental and medical sampling in virology.  ...  The Third-Generation in DNA sequencing has emerged in the last few years using new technologies that allow the production of long-read sequences.  ...  $\frac{f(x + h) - f(x)}{h}$ (15) The computation of an analytic gradient is fast and exact but error prone. The computation of a numerical gradient is easy to write, but slow and approximate.  ... 
arXiv:1902.09148v1 fatcat:p74hb5tvkfdcbkz4nba6gvyvca

Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges [article]

Triet H. M. Le, Hao Chen, M. Ali Babar
2020 arXiv   pre-print
Deep Learning (DL) techniques for Natural Language Processing have been evolving remarkably fast.  ...  field of program learning.  ...  Attention mechanism. Attention mechanism can be used for addressing both the OoV issue and the hidden-state bottleneck of RNNs. Bhoopchand et al.  ... 
arXiv:2002.05442v1 fatcat:bt7dtzrcnjfk5jn6kmin2ruqii

The human Turing machine: a neural framework for mental programs

Ariel Zylberberg, Stanislas Dehaene, Pieter R. Roelfsema, Mariano Sigman
2011 Trends in Cognitive Sciences  
In recent years much has been learned about how a single computational processing step is implemented in the brain.  ...  Despite these profound differences in architecture, the human brain can be surprisingly slow and serial in executing certain tasks (Box 1).  ...  SD is supported by a senior grant of the European Research Council. PRR was supported by a NWO-VICI grant.  ... 
doi:10.1016/j.tics.2011.05.007 pmid:21696998 fatcat:rehrgtumvneazh2fcxxeki6l7e

Deep Gait Recognition: A Survey [article]

Alireza Sepas-Moghaddam, Ali Etemad
2022 arXiv   pre-print
In this paper, we present a comprehensive overview of breakthroughs and recent developments in gait recognition with deep learning, and cover broad topics including datasets, test protocols, state-of-the-art  ...  Gait recognition methods based on deep learning now dominate the state-of-the-art in the field and have fostered real-world applications.  ...  In the first approach, the temporal dynamics over the sequences are learned using recurrent learning strategies, for example recurrent neural networks, where each frame is processed with respect to its  ... 
arXiv:2102.09546v2 fatcat:iwzddzjy2rhunbuqz7h6tso5ii

A Review on Deep Learning Techniques for Video Prediction [article]

Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez, Antonis Argyros
2020 arXiv   pre-print
Motivated by the increasing interest in this task, we provide a review on the deep learning methods for prediction in video sequences.  ...  In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising research direction.  ...  Predictions of large and fast-moving objects are accurate, however, when it comes to small and slow-moving objects there is still room for improvement.  ... 
arXiv:2004.05214v2 fatcat:weerbkanmjb4dn6wkn5o4b5aia