Learning ReLU Networks via Alternating Minimization
[article]
2018
arXiv
pre-print
Our algorithms are based on the technique of alternating minimization: estimating the activation patterns of each ReLU for all given samples, interleaved with weight updates via a least-squares step. ...
We propose and analyze a new family of algorithms for training neural networks with ReLU activations. ...
Algorithm and Analysis: We now propose our alternating minimization-based framework to learn shallow networks with ReLU activations using an ℓ2 loss function. ...
arXiv:1806.07863v2
fatcat:nrvn4bifubc43hhfmpzxttlpy4
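The abstract above describes the core loop: estimate each sample's ReLU activation pattern, then update the weights by least squares. A minimal single-unit sketch of that alternating scheme (illustrative only; the paper's algorithm handles multiple units and comes with analysis, and the function name and synthetic data here are made up) could look like:

```python
import numpy as np

def fit_relu_unit(X, y, n_iters=50, seed=0):
    """Alternating minimization for y ~ ReLU(X @ w).

    Step 1: with w fixed, estimate each sample's activation pattern.
    Step 2: with the pattern fixed, update w by least squares on the
            samples currently estimated to be active.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.standard_normal(d)
    for _ in range(n_iters):
        # Step 1: activation pattern of every sample under the current w.
        active = (X @ w > 0)
        if not active.any():
            break
        # Step 2: least-squares weight update restricted to active samples.
        w, *_ = np.linalg.lstsq(X[active], y[active], rcond=None)
    return w

# Tiny usage example on synthetic data.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y = np.maximum(X @ w_true, 0.0)
w_hat = fit_relu_unit(X, y)
```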
Concept and Experimental Demonstration of Optical IM/DD End-to-End System Optimization using a Generative Model
[article]
2019
arXiv
pre-print
We perform an experimental end-to-end transceiver optimization via deep learning using a generative adversarial network to approximate the test-bed channel. ...
Conclusions We experimentally implemented deep learning of a transceiver based on measured data, an important step towards practical end-to-end optimized transmission. ...
It enabled gradient backpropagation within each iteration of the end-to-end system learning. We observed a monotonic decrease in BER on each step of the optimization. ...
arXiv:1912.05146v2
fatcat:rwae6rnqx5d6dkzcv5jv6na2hq
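The idea summarized above is that a trained generative model of the measured channel is differentiable, so gradients can flow from the receiver loss back to the transmitter. The sketch below shows only that backpropagation path in PyTorch with made-up layer sizes; training of the generative channel model itself (the GAN step) is omitted, and none of this is the paper's implementation:

```python
import torch
import torch.nn as nn

# Hypothetical module sizes; the paper's transceiver and GAN architectures are not reproduced.
tx = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2))                # transmitter
channel_gen = nn.Sequential(nn.Linear(2 + 8, 32), nn.ReLU(), nn.Linear(32, 2))   # generative channel model
rx = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 4))                # receiver

opt = torch.optim.Adam(list(tx.parameters()) + list(rx.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):
    msgs = torch.randint(0, 4, (64,))
    onehot = nn.functional.one_hot(msgs, 4).float()
    x = tx(onehot)
    noise = torch.randn(64, 8)
    # The frozen generative channel model is differentiable, so gradients
    # can flow back from the receiver loss to the transmitter weights.
    y = channel_gen(torch.cat([x, noise], dim=1))
    logits = rx(y)
    loss = loss_fn(logits, msgs)
    opt.zero_grad()
    loss.backward()
    opt.step()
```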
OMLT: Optimization Machine Learning Toolkit
[article]
2022
arXiv
pre-print
The optimization and machine learning toolkit (OMLT) is an open-source software package incorporating neural network and gradient-boosted tree surrogate models, which have been trained using machine learning ...
An existing alternative for ReLU activation functions is the ReLUPartitionFormulation. ...
as the gateway to several alternative mathematical optimization formulations of neural networks. ...
arXiv:2202.02414v1
fatcat:b5o2qj3gj5b7nh5qvy5nem4xum
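OMLT exposes trained networks to mathematical optimization solvers through formulations such as the ReLUPartitionFormulation mentioned above. As background, the sketch below writes out a single-neuron big-M mixed-integer encoding of y = max(0, w*x + b) directly in Pyomo; it illustrates the kind of constraints such formulations generate and is not OMLT's actual API (the weight, bias, bounds, and big-M constant are illustrative):

```python
import pyomo.environ as pyo

w, b, M = 2.0, -1.0, 100.0   # illustrative weight, bias, and big-M bound

m = pyo.ConcreteModel()
m.x = pyo.Var(bounds=(-10, 10))
m.y = pyo.Var(within=pyo.NonNegativeReals)
m.z = pyo.Var(within=pyo.Binary)          # z = 1 iff the neuron is active

# y = max(0, w*x + b) via the standard big-M disjunction.
m.pre    = pyo.Constraint(expr=m.y >= w * m.x + b)
m.ub_on  = pyo.Constraint(expr=m.y <= w * m.x + b + M * (1 - m.z))
m.ub_off = pyo.Constraint(expr=m.y <= M * m.z)

# Example objective: maximize the neuron output over the input box.
m.obj = pyo.Objective(expr=m.y, sense=pyo.maximize)
# pyo.SolverFactory("cbc").solve(m)   # requires a MILP solver to be installed
```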
Generative-Discriminative Variational Model for Visual Recognition
[article]
2017
arXiv
pre-print
Despite the promising power of deep neural networks (DNN), how to alleviate overfitting during training has been a research topic of interest. ...
The paradigm shift from shallow classifiers with hand-crafted features to end-to-end trainable deep learning models has shown significant improvements on supervised learning tasks. ...
The detailed network structure is presented in the supplementary material; the network is learned via the SGD optimizer with learning rate 0.05 and momentum 0.9. ...
arXiv:1706.02295v1
fatcat:gnu5xpmprjf4rfberm4mvmsuoy
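The snippet above reports training with SGD at learning rate 0.05 and momentum 0.9; in PyTorch that optimizer configuration is simply (the model here is a placeholder, not the paper's network):

```python
import torch

model = torch.nn.Linear(10, 2)   # placeholder for the paper's network
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
```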
Contrastive Learning for Lifted Networks
[article]
2019
arXiv
pre-print
In this work we address supervised learning of neural networks via lifted network formulations. ...
Lifted networks are interesting because they allow training on massively parallel hardware and assign energy models to discriminatively trained neural networks. ...
Learning with lifted networks: If a training set {(x_i, y_i)}_{i=1}^{N} is given, then learning with lifted networks is performed by jointly minimizing over weights and activations [21, 24, 2], J_0(W) ...
arXiv:1905.02507v2
fatcat:iwgqovxazbfvbhj73opqbxhepa
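In the lifted formulation quoted above, the activations are treated as free variables tied to the weights only through penalty terms, and learning minimizes jointly (or alternately) over both. Below is a small sketch of such a penalized objective for one hidden layer, with made-up shapes and a quadratic coupling term; the cited papers use related but not identical objectives:

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def lifted_objective(W1, W2, Z, X, Y, lam=1.0):
    """Penalized lifted objective: the activations Z are free variables that
    are only softly tied to ReLU(X @ W1), and the output layer is fit on Z.
    Learning minimizes this jointly (or alternately) over W1, W2, and Z."""
    coupling = np.sum((Z - relu(X @ W1)) ** 2)   # Z ~ ReLU(W1 x)
    data_fit = np.sum((Y - Z @ W2) ** 2)         # y ~ W2 z
    return data_fit + lam * coupling

# Tiny usage example with a feasible initialization of Z.
rng = np.random.default_rng(0)
X, Y = rng.standard_normal((50, 3)), rng.standard_normal((50, 1))
W1, W2 = rng.standard_normal((3, 8)), rng.standard_normal((8, 1))
Z = relu(X @ W1)
print(lifted_objective(W1, W2, Z, X, Y))
```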
Inverting Adversarially Robust Networks for Image Synthesis
[article]
2022
arXiv
pre-print
Comparisons against recent learning-based methods show that our model attains improved performance with significantly less complexity. ...
On the largest one (CBSD68), it also outperforms alternative learning-based techniques. ...
Adversarial training adds perturbations to the input data and lets the network learn how to classify in the presence of such adversarial attacks [3, 22, 34] . ...
arXiv:2106.06927v3
fatcat:iwuegtgh6jcu5f4hcgwvov25kq
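The last snippet describes adversarial training: perturb the inputs and let the network learn to classify in the presence of such attacks. Below is a minimal single-step FGSM variant in PyTorch; the referenced works [3, 22, 34] use various attacks (e.g. multi-step PGD), so this is only the simplest instance, with an illustrative model and data:

```python
import torch
import torch.nn as nn

def fgsm_adversarial_step(model, x, y, loss_fn, optimizer, eps=0.03):
    """One adversarial-training step: craft a sign-of-gradient (FGSM)
    perturbation of the input, then train the model on the perturbed batch."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).detach()   # adds the adversarial perturbation
    optimizer.zero_grad()
    adv_loss = loss_fn(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()

# Usage example on random data.
model = nn.Sequential(nn.Linear(20, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
fgsm_adversarial_step(model, torch.randn(16, 20), torch.randint(0, 2, (16,)),
                      nn.CrossEntropyLoss(), opt)
```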
Tensor Switching Networks
[article]
2016
arXiv
pre-print
Our experimental results demonstrate that the TS network is indeed more expressive and consistently learns faster than standard ReLU networks. ...
We present a novel neural network algorithm, the Tensor Switching (TS) network, which generalizes the Rectified Linear Unit (ReLU) nonlinearity to tensor-valued hidden units. ...
A deep SS-ReLU network with L layers may then be expressed as a sequence of alternating expansion and contraction steps, X_L = X_0 ⊕_{W_1} W_1 · · · ⊕_{W_L} W_L. (2) To obtain the deep TS-ReLU network, we ...
arXiv:1610.10087v1
fatcat:3ymtnzewjfa7xfa7cvnzkb3ese
Provable defenses against adversarial examples via the convex outer adversarial polytope
[article]
2018
arXiv
pre-print
We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data. ...
case loss over this outer region (via a linear program). ...
for a ReLU network. ...
arXiv:1711.00851v3
fatcat:u6dxtu4rtjg6rlywbtvqtcwe2u
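The paper bounds the worst-case loss over a convex outer region via a linear program. The sketch below shows a much coarser but related relaxation: propagating elementwise interval bounds through ReLU layers under an ℓ∞ input perturbation. It is not the paper's LP-based method, and the weights in the usage lines are random placeholders:

```python
import numpy as np

def interval_bounds(layers, x, eps):
    """Elementwise lower/upper bounds on the activations of a ReLU network
    for all inputs within an l_inf ball of radius eps around x. A ReLU is
    applied after every layer for simplicity."""
    lo, hi = x - eps, x + eps
    for W, b in layers:
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lo = W_pos @ lo + W_neg @ hi + b
        new_hi = W_pos @ hi + W_neg @ lo + b
        lo, hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
    return lo, hi

# Usage example with one random layer.
W1, b1 = np.random.randn(16, 8), np.zeros(16)
lo, hi = interval_bounds([(W1, b1)], np.zeros(8), eps=0.1)
```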
Deep Component Analysis via Alternating Direction Neural Networks
[article]
2018
arXiv
pre-print
For inference, we propose a differentiable optimization algorithm implemented using recurrent Alternating Direction Neural Networks (ADNNs) that enable parameter learning using standard backpropagation ...
On the other hand, shallow representation learning with component analysis is associated with rich intuition and theory, but smaller capacity often limits its usefulness. ...
Deep Neural Networks Recently, deep neural networks have emerged as the preferred alternative to component analysis for representation learning of visual data. ...
arXiv:1803.06407v1
fatcat:tfivbuxbvbfc5lepgeglb5gpru
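ADNNs make inference itself a differentiable, unrolled optimization so that the model parameters can be learned with standard backpropagation. The sketch below unrolls a simpler proximal-gradient scheme for nonnegative coding, where ReLU acts as the projection onto the nonnegative orthant; it is a simplified stand-in for the paper's alternating-direction updates, with made-up dimensions and step size:

```python
import torch
import torch.nn as nn

class UnrolledNonnegCoding(nn.Module):
    """Unrolled proximal-gradient inference for nonnegative coding
    x ~ W z, z >= 0. Each iteration is differentiable, so W is learnable
    with ordinary backpropagation."""
    def __init__(self, dim_x, dim_z, n_steps=5, step=0.1):
        super().__init__()
        self.W = nn.Parameter(0.1 * torch.randn(dim_x, dim_z))
        self.n_steps, self.step = n_steps, step

    def forward(self, x):
        z = torch.zeros(x.shape[0], self.W.shape[1], device=x.device)
        for _ in range(self.n_steps):
            residual = x - z @ self.W.t()
            z = torch.relu(z + self.step * residual @ self.W)   # ReLU = projection onto z >= 0
        return z

# Usage example.
model = UnrolledNonnegCoding(dim_x=20, dim_z=50)
codes = model(torch.randn(8, 20))
```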
Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields
[chapter]
2016
Lecture Notes in Computer Science
there exists greater correlation between depth and intrinsic images, and the incorporation of a gradient scale network that learns the confidence of estimated gradients in order to effectively balance ...
The two tasks are formulated in a synergistic manner through a joint conditional random field (CRF) that is solved using a novel convolutional neural network (CNN) architecture, called the joint convolutional ...
The energy function can thus be minimized by alternating among its terms. ...
doi:10.1007/978-3-319-46484-8_9
fatcat:sd3r2qewvbcllcj2zvzuryq5hy
Unified Depth Prediction and Intrinsic Image Decomposition from a Single Image via Joint Convolutional Neural Fields
[article]
2016
arXiv
pre-print
there exists greater correlation between depth and intrinsic images, and the incorporation of a gradient scale network that learns the confidence of estimated gradients in order to effectively balance ...
The two tasks are formulated in a synergistic manner through a joint conditional random field (CRF) that is solved using a novel convolutional neural network (CNN) architecture, called the joint convolutional ...
The energy function can thus be minimized by alternating among its terms. ...
arXiv:1603.06359v1
fatcat:zmxww7ionncfxltcxelggbzm2q
Learning Stable Deep Dynamics Models
[article]
2020
arXiv
pre-print
Deep networks are commonly used to model dynamical systems, predicting how the state of a system will evolve over time (either autonomously or in response to control inputs). ...
The approach works by jointly learning a dynamics model and Lyapunov function that guarantees non-expansiveness of the dynamics under the learned Lyapunov function. ...
techniques such as alternating minimization need to be employed instead. ...
arXiv:2001.06116v1
fatcat:5ixd3au4wjhplhifpwab7vtooy
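The approach quoted above jointly learns a dynamics model and a Lyapunov function and guarantees decrease of that function along trajectories. One way to realize such a guarantee is to project the nominal prediction whenever it would increase the Lyapunov function, as sketched below in PyTorch; the paper's exact construction (e.g. how V is made positive definite) is not reproduced, and alpha and the networks in the usage lines are illustrative:

```python
import torch
import torch.nn as nn

def stabilized_dynamics(f_nominal, V, x, alpha=0.1):
    """Project a nominal prediction f(x) so that V decreases along
    trajectories (enforcing dV/dt <= -alpha * V elementwise per sample)."""
    x = x.clone().requires_grad_(True)
    v = V(x).sum()
    grad_v, = torch.autograd.grad(v, x, create_graph=True)
    f = f_nominal(x)
    violation = torch.relu((grad_v * f).sum(dim=1, keepdim=True) + alpha * V(x))
    return f - grad_v * violation / (grad_v.pow(2).sum(dim=1, keepdim=True) + 1e-8)

# Usage example; the paper constructs V to be positive definite, omitted here.
f_nom = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))
V = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
x_dot = stabilized_dynamics(f_nom, V, torch.randn(8, 2))
```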
Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons
[article]
2018
arXiv
pre-print
In this paper, we propose a knowledge transfer method via distillation of activation boundaries formed by hidden neurons. ...
By the proposed method, the student learns a separating boundary between activation region and deactivation region formed by each neuron in the teacher. ...
The alternative loss is an approximation of the activation transfer loss which can be minimized with gradient descent. ...
arXiv:1811.03233v2
fatcat:c3mcd6k2jbesfi4awd7jpfvpdu
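The distillation target above is the sign of each teacher neuron's pre-activation, and the "alternative loss" is a differentiable surrogate for matching it. A plausible hinge-style form of such a surrogate is sketched below; the margin value and the exact loss used in the paper may differ:

```python
import torch

def activation_boundary_loss(teacher_pre, student_pre, margin=1.0):
    """Push the student's pre-activation to the same side of zero as the
    teacher's, with a margin, so the objective is minimizable by gradient descent."""
    teacher_active = (teacher_pre > 0).float()
    pos = teacher_active * torch.relu(margin - student_pre)        # teacher active: student should be > 0
    neg = (1 - teacher_active) * torch.relu(margin + student_pre)  # teacher inactive: student should be < 0
    return (pos + neg).pow(2).mean()

# Usage example on random pre-activations.
loss = activation_boundary_loss(torch.randn(16, 64), torch.randn(16, 64))
```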
Generating Accurate Pseudo-labels in Semi-Supervised Learning and Avoiding Overconfident Predictions via Hermite Polynomial Activations
[article]
2020
arXiv
pre-print
Further, we show via theoretical analysis, that the networks (with Hermite activations) offer robustness to noise and other attractive mathematical properties. ...
Motivated by some of these results, we explore the use of Hermite polynomial expansions as a substitute for ReLUs in deep networks. ...
We thank one of the reviewers for pointing out a promising connection to meta-learning that will be pursued in follow-up work. ...
arXiv:1909.05479v2
fatcat:sn6xcuk5lnhf5dhfciq4g4di5a
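The entry above replaces ReLUs with truncated Hermite polynomial expansions. Below is a sketch of such an activation with learnable coefficients, built from the probabilists' Hermite recurrence; the paper's normalization and initialization of the coefficients may differ:

```python
import torch
import torch.nn as nn

class HermiteActivation(nn.Module):
    """Activation given by a truncated Hermite-polynomial expansion with
    learnable coefficients, used in place of ReLU."""
    def __init__(self, degree=4):
        super().__init__()
        self.coeffs = nn.Parameter(torch.zeros(degree + 1))
        with torch.no_grad():
            self.coeffs[1] = 1.0   # start close to the identity map

    def forward(self, x):
        h_prev, h_curr = torch.ones_like(x), x   # He_0, He_1
        out = self.coeffs[0] * h_prev + self.coeffs[1] * h_curr
        for n in range(1, len(self.coeffs) - 1):
            # Probabilists' recurrence: He_{n+1}(x) = x*He_n(x) - n*He_{n-1}(x)
            h_prev, h_curr = h_curr, x * h_curr - n * h_prev
            out = out + self.coeffs[n + 1] * h_curr
        return out

# Usage example.
act = HermiteActivation()
y = act(torch.randn(8, 32))
```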
Exploring Beyond-Demonstrator via Meta Learning-Based Reward Extrapolation
[article]
2022
arXiv
pre-print
Extrapolating beyond-demonstrator (BD) performance through the imitation learning (IL) algorithm aims to learn from and subsequently outperform the demonstrator. ...
To that end, a representative approach is to leverage inverse reinforcement learning (IRL) to infer a reward function from demonstrations before performing RL on the learned reward function. ...
Module | Policy network    | Value network
Input  | States            | States
Arch.  | 8×8 Conv 32, ReLU | 8×8 Conv 32, ReLU
       | 4×4 Conv 64, ReLU | 4×4 Conv 64, ReLU
       | 3×3 Conv 32, ReLU | 3×3 Conv 32, ReLU
       | Flatten           | Flatten
       | Dense 512         | Dense ...
arXiv:2102.02454v12
fatcat:shkdaouae5btriteywqc3447zy
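The architecture table in this entry lists identical convolutional trunks for the policy and value networks. A PyTorch rendering of that table is sketched below; strides, input channels, and the value head's width (truncated in the snippet) are not given and are assumed here:

```python
import torch.nn as nn

def conv_trunk(in_channels=4):
    """Convolutional trunk matching the table above. The Atari-style
    4/2/1 strides and the 4 input channels are assumptions."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        nn.Conv2d(64, 32, kernel_size=3, stride=1), nn.ReLU(),
        nn.Flatten(),
    )

# Policy head uses Dense 512; the value head's width is truncated in the
# snippet, so 512 is used here only as a placeholder.
policy_net = nn.Sequential(conv_trunk(), nn.LazyLinear(512))
value_net = nn.Sequential(conv_trunk(), nn.LazyLinear(512))
```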
Showing results 1 — 15 out of 15,951 results