46 Hits in 0.54 sec

Recurrent Neural Processes [article]

Timon Willi, Jonathan Masci, Jürgen Schmidhuber, Christian Osendorfer
2019 arXiv   pre-print
We extend Neural Processes (NPs) to sequential data through Recurrent NPs or RNPs, a family of conditional state space models. RNPs model the state space with Neural Processes. Given time series observed on fast real-world time scales but containing slow long-term variabilities, RNPs may derive appropriate slow latent time scales. They do so in an efficient manner by establishing conditional independence among subsequences of the time series. Our theoretically grounded framework for stochastic processes expands the applicability of NPs while retaining their benefits of flexibility, uncertainty estimation, and favorable runtime with respect to Gaussian Processes (GPs). We demonstrate that state spaces learned by RNPs benefit predictive performance on real-world time-series data and nonlinear system identification, even in the case of limited data availability.
arXiv:1906.05915v2 fatcat:ytwzcvc27ffyrhydu3ko3c4gii

Improving approximate RPCA with a k-sparsity prior [article]

Maximilian Karl, Christian Osendorfer
2014 arXiv   pre-print
A process-centric view of robust PCA (RPCA) allows its fast approximate implementation based on a special form of a deep neural network with weights shared across all layers. However, empirically this fast approximation to RPCA fails to find representations that are parsimonious. We resolve these bad local minima by relaxing the elementwise L1 and L2 priors and instead utilize a structure-inducing k-sparsity prior. In a discriminative classification task the newly learned representations significantly outperform those from the original approximate RPCA formulation.
arXiv:1412.8291v1 fatcat:selmfj3eubcq5oedhllbbnyv4a
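The k-sparsity prior described above can be sketched as a simple projection: keep the k largest-magnitude entries of each code vector and zero the rest, in place of elementwise L1 shrinkage. This is a minimal illustration of the idea, not the paper's exact formulation inside the unrolled RPCA network.

```python
import numpy as np

def k_sparsify(z, k):
    """Keep the k largest-magnitude entries of each row of z, zero the rest.

    A minimal sketch of a k-sparsity operation; the exact placement of
    this step inside the shared-weight RPCA network may differ.
    """
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(z)
    # Indices of the k largest |z| entries per row.
    idx = np.argpartition(np.abs(z), -k, axis=1)[:, -k:]
    rows = np.arange(z.shape[0])[:, None]
    out[rows, idx] = z[rows, idx]
    return out

codes = k_sparsify(np.array([[0.1, -2.0, 0.5, 3.0]]), k=2)
# Only the two largest-magnitude entries (-2.0 and 3.0) survive.
```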

Learning Sequence Neighbourhood Metrics [article]

Justin Bayer, Christian Osendorfer, Patrick van der Smagt
2013 arXiv   pre-print
Recurrent neural networks (RNNs) in combination with a pooling operator and the neighbourhood components analysis (NCA) objective function are able to detect the characterizing dynamics of sequences and embed them into a fixed-length vector space of arbitrary dimensionality. Subsequently, the resulting features are meaningful and can be used for visualization or nearest neighbour classification in linear time. This kind of metric learning for sequential data enables the use of algorithms tailored towards fixed length vector spaces such as R^n.
arXiv:1109.2034v2 fatcat:3pemisnujzeu5eultx64lu6htm
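The encoder described above can be sketched as a plain tanh RNN whose hidden states are mean-pooled into one fixed-length vector, so sequences of any length land in the same R^n. The NCA training objective that shapes this space is omitted here; the weights below are random stand-ins.

```python
import numpy as np

def embed_sequence(xs, W_in, W_rec, b):
    """Mean-pool the hidden states of a plain tanh RNN into one vector.

    A minimal sketch of the encoder only; in the paper this embedding is
    trained with the NCA objective, which is not shown here.
    """
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in xs:  # xs: (T, input_dim)
        h = np.tanh(W_in @ x + W_rec @ h + b)
        states.append(h)
    return np.mean(states, axis=0)  # fixed length, regardless of T

rng = np.random.default_rng(0)
W_in, W_rec, b = rng.normal(size=(8, 3)), rng.normal(size=(8, 8)), np.zeros(8)
e_short = embed_sequence(rng.normal(size=(5, 3)), W_in, W_rec, b)
e_long = embed_sequence(rng.normal(size=(50, 3)), W_in, W_rec, b)
# Both sequences map to 8-dimensional vectors, usable for kNN or plotting.
```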

No Representation without Transformation [article]

Giorgio Giannone, Saeed Saremi, Jonathan Masci, Christian Osendorfer
2020 arXiv   pre-print
arXiv:1912.03845v2 fatcat:tkjmng2pjbfk7lwcxed7iskr7q

Learning Stochastic Recurrent Networks [article]

Justin Bayer, Christian Osendorfer
2015 arXiv   pre-print
Leveraging advances in variational inference, we propose to enhance recurrent neural networks with latent variables, resulting in Stochastic Recurrent Networks (STORNs). The model i) can be trained with stochastic gradient methods, ii) allows structured and multi-modal conditionals at each time step, iii) features a reliable estimator of the marginal likelihood and iv) is a generalisation of deterministic recurrent neural networks. We evaluate the method on four polyphonic musical data sets and motion capture data.
arXiv:1411.7610v3 fatcat:vdft7yrvznhvbcb3w4iggbbkay

Policy Gradients for Cryptanalysis [chapter]

Frank Sehnke, Christian Osendorfer, Jan Sölter, Jürgen Schmidhuber, Ulrich Rührmair
2010 Lecture Notes in Computer Science  
So-called Physical Unclonable Functions are an emerging, new cryptographic and security primitive. They can potentially replace secret binary keys in vulnerable hardware systems and have other security advantages. In this paper, we deal with the cryptanalysis of this new primitive by use of machine learning methods. In particular, we investigate to what extent the security of circuit-based PUFs can be challenged by a new machine learning technique named Policy Gradients with Parameter-based Exploration. Our findings show that this technique has several important advantages in the cryptanalysis of Physical Unclonable Functions compared to other machine learning techniques and to other policy gradient methods.
doi:10.1007/978-3-642-15825-4_22 fatcat:qvpo5lqk2fb5zf6rqfjwgoqldq

Learning Sequence Neighbourhood Metrics [chapter]

Justin Bayer, Christian Osendorfer, Patrick van der Smagt
2012 Lecture Notes in Computer Science  
doi:10.1007/978-3-642-33269-2_67 fatcat:6aueoleg6rhjtdffowr53kljeu

Sequential Feature Selection for Classification [chapter]

Thomas Rückstieß, Christian Osendorfer, Patrick van der Smagt
2011 Lecture Notes in Computer Science  
In most real-world information processing problems, data is not a free resource; its acquisition is rather time-consuming and/or expensive. We investigate how these two factors can be included in supervised classification tasks by deriving classification as a sequential decision process and making it accessible to Reinforcement Learning. Our method performs a sequential feature selection that learns which features are most informative at each timestep, choosing the next feature depending on the already selected features and the internal belief of the classifier. Experiments on a handwritten digits classification task show significant reduction in required data for correct classification, while a medical diabetes prediction task illustrates variable feature cost minimization as a further property of our algorithm.
doi:10.1007/978-3-642-25832-9_14 fatcat:kltj54r4b5blhoofaezqidnqlq
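The sequential decision process above can be sketched as an episode in which an agent pays for one unseen feature per timestep and then classifies from the partial observation. The `choose` policy below is an arbitrary callable standing in for the one the paper learns with reinforcement learning, and `predict` is a hypothetical classifier.

```python
import numpy as np

def classify_sequentially(x, budget, choose, predict):
    """Acquire features one at a time, then classify.

    `choose` picks the next feature index given the revealed features
    (the paper learns this policy with reinforcement learning; here it
    is any callable), and `predict` maps the partial observation to a
    label. A sketch of the decision process, not the trained system.
    """
    revealed = {}
    for _ in range(budget):
        i = choose(revealed)   # next feature to pay for
        revealed[i] = x[i]     # acquire it
    return predict(revealed)

# Toy run: reveal the two cheapest-indexed features, then threshold their sum.
x = np.array([0.2, 5.0, -1.0, 3.3])
label = classify_sequentially(
    x, budget=2,
    choose=lambda seen: min(i for i in range(4) if i not in seen),
    predict=lambda seen: int(sum(seen.values()) > 0),
)
```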

Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech [article]

Ilya Sklyar, Anna Piunova, Christian Osendorfer
2022 arXiv   pre-print
Streaming recognition and segmentation of multi-party conversations with overlapping speech is crucial for the next generation of voice assistant applications. In this work we address its challenges discovered in the previous work on multi-turn recurrent neural network transducer (MT-RNN-T) with a novel approach, separator-transducer-segmenter (STS), that enables tighter integration of speech separation, recognition and segmentation in a single model. First, we propose a new segmentation modeling strategy through start-of-turn and end-of-turn tokens that improves segmentation without recognition accuracy degradation. Second, we further improve both speech recognition and segmentation accuracy through an emission regularization method, FastEmit, and multi-task training with speech activity information as an additional training signal. Third, we experiment with end-of-turn emission latency penalty to improve end-point detection for each speaker turn. Finally, we establish a novel framework for segmentation analysis of multi-party conversations through emission latency metrics. With our best model, we report 4.6% abs. turn counting accuracy improvement and 17% rel. word error rate (WER) improvement on LibriCSS dataset compared to the previously published work.
arXiv:2205.05199v1 fatcat:n7a37hcpf5ddtg7bkjli6e6zaq

Deep Iterative Surface Normal Estimation [article]

Jan Eric Lenssen, Christian Osendorfer, Jonathan Masci
2020 arXiv   pre-print
This paper presents an end-to-end differentiable algorithm for robust and detail-preserving surface normal estimation on unstructured point-clouds. We utilize graph neural networks to iteratively parameterize an adaptive anisotropic kernel that produces point weights for weighted least-squares plane fitting in local neighborhoods. The approach retains the interpretability and efficiency of traditional sequential plane fitting while benefiting from adaptation to data set statistics through deep learning. This results in a state-of-the-art surface normal estimator that is robust to noise, outliers and point density variation, preserves sharp features through anisotropic kernels and equivariance through a local quaternion-based spatial transformer. Contrary to previous deep learning methods, the proposed approach does not require any hand-crafted features or preprocessing. It improves on the state-of-the-art results while being more than two orders of magnitude faster and more parameter efficient.
arXiv:1904.07172v3 fatcat:m244uugwr5dvdm55zhylimu4gy
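The weighted least-squares plane-fitting step at the core of the method above can be sketched directly: given a local neighborhood and per-point weights (which in the paper come from the iteratively refined graph neural network, and here are just inputs), the normal is the eigenvector of the weighted covariance with the smallest eigenvalue.

```python
import numpy as np

def weighted_plane_normal(pts, w):
    """Normal of the weighted least-squares plane through `pts`.

    `pts` is (N, 3); `w` is (N,) nonnegative weights (produced by the
    GNN in the paper; arbitrary here). Returns a unit normal: the
    eigenvector of the weighted covariance with the smallest eigenvalue.
    """
    w = w / w.sum()
    mu = w @ pts                  # weighted centroid
    d = pts - mu
    cov = (w[:, None] * d).T @ d  # weighted covariance, 3x3
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, 0]          # eigh sorts ascending: smallest first

# Points lying in the z = 0 plane: the recovered normal is +/- (0, 0, 1).
pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], float)
n = weighted_plane_normal(pts, np.ones(4))
```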

Training Neural Networks with Implicit Variance [chapter]

Justin Bayer, Christian Osendorfer, Sebastian Urban, Patrick van der Smagt
2013 Lecture Notes in Computer Science  
We present a novel method to train predictive Gaussian distributions p(z|x) for regression problems with neural networks. While most approaches either ignore or explicitly model the variance as another response variable, it is trained implicitly in our case. Establishing stochasticity by the injection of noise into the input and hidden units, the outputs are approximated with a Gaussian distribution by the forward propagation method introduced for fast dropout [1]. We have designed our method to respect that probabilistic interpretation of the output units in the loss function. The method is evaluated on a synthetic and an inverse robot dynamics task, yielding superior performance to plain neural networks, Gaussian processes and LWPR in terms of mean squared error and likelihood.
doi:10.1007/978-3-642-42042-9_17 fatcat:bjggng4xgvgtvnsumdrtoto6iy
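The probabilistic interpretation the loss must respect is the Gaussian negative log-likelihood in both mean and variance. The sketch below only evaluates that loss on given (mu, var) pairs; in the paper these quantities arise implicitly from forward-propagating injected noise (fast dropout) rather than from an explicit variance output.

```python
import numpy as np

def gaussian_nll(mu, var, z):
    """Negative log-likelihood of targets z under N(mu, var).

    The loss whose probabilistic interpretation the output units must
    respect; here mu and var are plain inputs, whereas the paper obtains
    them implicitly via fast-dropout forward propagation.
    """
    return 0.5 * np.mean(np.log(2 * np.pi * var) + (z - mu) ** 2 / var)

# Calibrated uncertainty is rewarded: for a fixed error of 1.0, a
# variance of 1.0 scores better than an overconfident variance of 0.01.
err = np.array([1.0])
good = gaussian_nll(np.array([0.0]), np.array([1.0]), err)
bad = gaussian_nll(np.array([0.0]), np.array([0.01]), err)
```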

Parameter-exploring policy gradients

Frank Sehnke, Christian Osendorfer, Thomas Rückstieß, Alex Graves, Jan Peters, Jürgen Schmidhuber
2010 Neural Networks  
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in parameter space, which leads to lower variance gradient estimates than obtained by regular policy gradient methods. We show that for several complex control tasks, including robust standing with a humanoid robot, this method outperforms well-known algorithms from the fields of standard policy gradients, finite difference methods and population-based heuristics. We also show that the improvement is largest when the parameter samples are drawn symmetrically. Lastly we analyse the importance of the individual components of our method by incrementally incorporating them into the other algorithms, and measuring the gain in performance after each step.
doi:10.1016/j.neunet.2009.12.004 pmid:20061118 fatcat:ali2i7d7ibakhf64ycmdsv3h5u
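The symmetric parameter-space sampling highlighted above can be sketched as follows: perturb the policy parameters by +/- eps around the current mean, evaluate the reward on both sides, and move the mean toward the better perturbation. This is a deliberately simplified update; the paper's variance adaptation and reward baselines are omitted.

```python
import numpy as np

def pgpe_step(theta, sigma, reward, lr=0.1, rng=None):
    """One update with a symmetric parameter-space sample.

    Draws theta +/- eps, evaluates `reward` on both, and shifts theta
    toward the better side. A simplified sketch of the symmetric-sampling
    idea, not the full algorithm from the paper.
    """
    rng = rng or np.random.default_rng()
    eps = rng.normal(scale=sigma, size=theta.shape)
    r_plus, r_minus = reward(theta + eps), reward(theta - eps)
    # Symmetric difference: lower-variance gradient estimate.
    return theta + lr * (r_plus - r_minus) / 2.0 * eps / sigma**2

# Toy problem: episodic reward is highest at theta = [1, 1].
reward = lambda th: -np.sum((th - 1.0) ** 2)
theta = np.zeros(2)
rng = np.random.default_rng(0)
for _ in range(200):
    theta = pgpe_step(theta, sigma=0.1, reward=reward, lr=0.05, rng=rng)
# theta drifts toward the optimum without ever computing a true gradient.
```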

Convolutional Neural Networks learn compact local image descriptors [article]

Christian Osendorfer, Justin Bayer, Patrick van der Smagt
2013 arXiv   pre-print
A standard deep convolutional neural network paired with a suitable loss function learns compact local image descriptors that perform comparably to state-of-the art approaches.
arXiv:1304.7948v2 fatcat:aqqk5tcbgzbuxlui4pct5lcgni

Variational inference of latent state sequences using Recurrent Networks [article]

Justin Bayer, Christian Osendorfer
2014 arXiv   pre-print
Recent advances in the estimation of deep directed graphical models and recurrent networks let us contribute to the removal of a blind spot in the area of probabilistic modelling of time series. The proposed methods i) can infer distributed latent state-space trajectories with nonlinear transitions, ii) scale to large data sets thanks to the use of a stochastic objective and fast, approximate inference, iii) enable the design of rich emission models which iv) will naturally lead to structured outputs. Two different paths of introducing latent state sequences are pursued, leading to the variational recurrent auto encoder (VRAE) and the variational one step predictor (VOSP). The use of independent Wiener processes as priors on the latent state sequence is a viable compromise between efficient computation of the Kullback-Leibler divergence from the variational approximation of the posterior and maintaining a reasonable belief in the dynamics. We verify our methods empirically, obtaining results close or superior to the state of the art. We also show qualitative results for denoising and missing value imputation.
arXiv:1406.1655v2 fatcat:dj4emg5bmbbnxfsueyojc3ka44

Convolutional Neural Networks Learn Compact Local Image Descriptors [chapter]

Christian Osendorfer, Justin Bayer, Sebastian Urban, Patrick van der Smagt
2013 Lecture Notes in Computer Science  
A standard deep convolutional neural network paired with a suitable loss function learns compact local image descriptors that perform comparably to state-of-the art approaches.
doi:10.1007/978-3-642-42051-1_77 fatcat:uhnexdak6ncvvanujj6ndrx3pm
Showing results 1–15 of 46