Non-parametric Binary regression in metric spaces with KL loss
[article]
2020
arXiv
pre-print
We propose a non-parametric variant of binary regression, where the hypothesis is regularized to be a Lipschitz function taking a metric space to [0,1] and the loss is logarithmic. ...
We get around this challenge via an adaptive truncation approach, and also present a lower bound indicating that the truncation is, in some sense, necessary. ...
Non-parametric binary regression has been employed in a number of works. ...
arXiv:2010.09886v1
fatcat:uobmz43yjbhtpm5dwapgb3dfsi
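The truncation idea in this abstract can be sketched in a few lines: clip the hypothesis's output away from {0, 1} before applying the logarithmic loss so that the loss stays bounded. A minimal Python sketch with a fixed truncation level eps; the paper's truncation is adaptive, so this fixed version is only illustrative:

import numpy as np

def truncated_log_loss(p, y, eps=0.05):
    # Clip predictions to [eps, 1 - eps] so the log terms stay bounded.
    # A fixed eps stands in for the paper's adaptive truncation.
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))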
Metric Gaussian Variational Inference
[article]
2020
arXiv
pre-print
We alternate between approximating the covariance with the inverse Fisher information metric evaluated at an intermediate mean estimate and optimizing the KL-divergence for the given covariance with respect ...
With this method we achieve higher accuracy and in many cases a significant speedup compared to traditional methods. ...
Here we discuss the problem of binary Gaussian process classification in two dimensions with non-parametric kernel estimation. The data consists of binary values with associated locations. ...
arXiv:1901.11033v3
fatcat:4xth43f4mzaanir4rwr5hufq2i
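The alternation described in the first snippet can be sketched on a toy model. The Python sketch below assumes a Bayesian logistic-regression posterior with a standard-normal prior and plain gradient steps on the mean; MGVI proper works with implicit operators and natural gradients, so this shows only the basic loop:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # synthetic design matrix
y = (rng.random(100) < 1 / (1 + np.exp(-X @ np.array([1.0, -2.0, 0.5])))).astype(float)

def grad_neg_log_post(w):
    p = 1 / (1 + np.exp(-X @ w))
    return X.T @ (p - y) + w  # likelihood gradient + N(0, I) prior gradient

def fisher(w):
    p = 1 / (1 + np.exp(-X @ w))
    return X.T @ (X * (p * (1 - p))[:, None]) + np.eye(len(w))  # + prior curvature

m = np.zeros(3)
for _ in range(100):
    cov = np.linalg.inv(fisher(m))  # covariance <- inverse Fisher metric at m
    # KL(q||p) gradient over the mean, estimated with samples from N(m, cov)
    zs = rng.multivariate_normal(m, cov, size=32)
    g = np.mean([grad_neg_log_post(z) for z in zs], axis=0)
    m -= 0.05 * g  # gradient step on the mean, holding cov fixed
print("approximate posterior mean:", m)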
Distribution Calibration for Regression
[article]
2019
arXiv
pre-print
We are concerned with obtaining well-calibrated output distributions from regression models. ...
We further propose a post-hoc approach to improving the predictions from previously trained regression models, using multi-output Gaussian Processes with a novel Beta link function. ...
Isotonic calibration is a powerful non-parametric method based on isotonic regression along with a simple iterative algorithm called Pool Adjacent Violators (PAV), which finds the train-optimal regression ...
arXiv:1905.06023v1
fatcat:u3kqvmyinngf5dpbosqp3f7y3y
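The Pool Adjacent Violators (PAV) algorithm mentioned in the last snippet is short enough to sketch directly. A minimal Python version for a non-decreasing fit, with uniform weights assumed; isotonic calibration would apply it to labels sorted by model score:

def pool_adjacent_violators(y):
    # Isotonic (non-decreasing) least-squares fit to y via PAV.
    vals, cnts = [], []  # running block means and block sizes
    for v in y:
        vals.append(float(v))
        cnts.append(1)
        # Merge adjacent blocks while monotonicity is violated.
        while len(vals) > 1 and vals[-2] > vals[-1]:
            total = vals[-1] * cnts[-1] + vals[-2] * cnts[-2]
            cnts[-2] += cnts[-1]
            vals[-2] = total / cnts[-2]
            vals.pop()
            cnts.pop()
    out = []
    for v, c in zip(vals, cnts):
        out.extend([v] * c)
    return out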
Projective Latent Interventions for Understanding and Fine-tuning Classifiers
[article]
2020
arXiv
pre-print
PLIs allow domain experts to control the latent decision space in an intuitive way in order to better match their expectations. ...
The back-propagation is based on parametric approximations of t-distributed stochastic neighbourhood embeddings. ...
In the actual training phase, we calculate low-dimensional pairwise probabilities q_ij for each input batch, and use the KL-divergence KL(p_ij || q_ij) as a loss function. ...
arXiv:2006.12902v2
fatcat:cfxtxadjgvf57pjxclpb5x2yai
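The loss in the last snippet is the standard t-SNE objective: a Student-t kernel produces low-dimensional pairwise probabilities q_ij, which are matched to the high-dimensional p_ij under KL divergence. A small numpy sketch; batching and the parametric encoder are omitted:

import numpy as np

def student_t_q(Y, eps=1e-12):
    # Low-dimensional pairwise probabilities q_ij from the Student-t kernel.
    d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    num = 1.0 / (1.0 + d2)
    np.fill_diagonal(num, 0.0)  # q_ii is defined to be 0
    return num / max(num.sum(), eps)

def kl_loss(P, Q, eps=1e-12):
    # KL(p_ij || q_ij), summed over all pairs.
    P = np.clip(P, eps, None)
    Q = np.clip(Q, eps, None)
    return float(np.sum(P * (np.log(P) - np.log(Q))))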
Dynamic Model Selection for Prediction Under a Budget
[article]
2017
arXiv
pre-print
We pose an empirical loss minimization problem with cost constraints to jointly train gating and prediction models. ...
A low-complexity gating model and prediction model are then learnt to adaptively approximate the high-accuracy model in regions where low-cost models are capable of making highly accurate predictions ...
Acknowledgments Feng Nan would like to thank Dr Ofer Dekel for ideas and discussions on resource constrained machine learning during an internship in Microsoft Research in summer 2016. ...
arXiv:1704.07505v1
fatcat:fd55lguvbrdc3jwg6jklznxzya
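At prediction time, the gating scheme in this abstract reduces to a per-example routing rule. A hedged Python sketch; the gate, models, and threshold are illustrative stand-ins, since the paper learns the gate and prediction models jointly under a cost constraint:

def predict_under_budget(x, gate, cheap_model, full_model, threshold=0.8):
    # Route to the low-cost model where the gate trusts it; otherwise
    # pay for the high-accuracy model. The threshold trades cost for accuracy.
    if gate(x) >= threshold:
        return cheap_model(x)
    return full_model(x)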
GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud
[article]
2018
arXiv
pre-print
Instead of treating object proposal as a direct bounding box regression problem, we take an analysis-by-synthesis strategy and generate proposals by reconstructing shapes from noisy observations in a scene ...
The success of GSPN largely comes from its emphasis on geometric understanding during object proposal, which greatly reduces proposals with low objectness. ...
Since we have parametrized q_φ(z|x, c) and p_θ(z|c) as N(μ_z, σ_z²) and N(μ_z′, σ_z′²) respectively through neural networks, the KL loss can be easily computed as: L_KL = log(σ_z′/σ_z) + (σ_z² + ...
arXiv:1812.03320v1
fatcat:3ybr53c73zbxtk2wi2utb7sziu
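The truncated equation in the last snippet appears to be the standard closed-form KL between two diagonal Gaussians. For reference, a numpy version of that closed form; the names are generic, not the paper's code:

import numpy as np

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    # KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ), summed over dimensions.
    return float(np.sum(np.log(sigma_p / sigma_q)
                        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
                        - 0.5))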
Wasserstein regularization for sparse multi-task regression
[article]
2019
arXiv
pre-print
We focus in this paper on high-dimensional regression problems where each regressor can be associated to a location in a physical space, or more generally a generic geometric space. ...
In this paper, we propose a convex regularizer for multi-task regression that encodes a more flexible geometry. ...
Our work is one of them in the context of sparse high-dimensional regression tasks where regressors can be associated with a geometric space. ...
arXiv:1805.07833v3
fatcat:yrgyudpnlrfufbgpcbwn2s4ki4
Distribution Matching in Variational Inference
[article]
2019
arXiv
pre-print
In this paper, we expose the limitations of Variational Autoencoders (VAEs), which consistently fail to learn marginal distributions in both latent and visible spaces. ...
With the increasingly widespread deployment of generative models, there is a mounting need for a deeper understanding of their behaviors and limitations. ...
Leveraging binary classifiers to estimate KL divergences results in underestimated KL values, even when the discriminator is trained to optimality (see Figure 3). ...
arXiv:1802.06847v4
fatcat:j5mgvrxwsffxnpite5r33tunoe
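The classifier-based KL estimator critiqued in the last snippet is the density-ratio trick: train a discriminator between samples of p and q, then average its logit under p. A self-contained sketch on one-dimensional Gaussians, where the true KL(p||q) is 0.5; the distributions and classifier are illustrative:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
xp = rng.normal(0.0, 1.0, size=(5000, 1))  # samples from p = N(0, 1)
xq = rng.normal(1.0, 1.0, size=(5000, 1))  # samples from q = N(1, 1)

X = np.vstack([xp, xq])
y = np.concatenate([np.ones(len(xp)), np.zeros(len(xq))])
clf = LogisticRegression().fit(X, y)  # discriminator between p and q

logit = clf.decision_function(xp)          # approximates log p(x) - log q(x)
print("estimated KL(p||q):", logit.mean())  # true value is 0.5 here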
Differentially Private Synthetic Mixed-Type Data Generation For Unsupervised Learning
[article]
2020
arXiv
pre-print
We implement this framework on both binary data (MIMIC-III) and mixed-type data (ADULT), and compare its performance with existing private algorithms on metrics in unsupervised settings. ...
We also introduce a new quantitative metric able to detect diversity, or lack thereof, of synthetic data. ...
For three continuous features in the ADULT dataset (capital gain, capital loss, and hours worked per week), we were not able to find a regression model with good fit (as measured by R² score) for the ...
arXiv:1912.03250v2
fatcat:lo6lugwudfgwjm6zze72bfrxcy
Federated Generalized Bayesian Learning via Distributed Stein Variational Gradient Descent
[article]
2021
arXiv
pre-print
This paper introduces Distributed Stein Variational Gradient Descent (DSVGD), a non-parametric generalized Bayesian inference framework for federated learning. ...
DSVGD is shown to compare favorably to benchmark frequentist and Bayesian federated learning strategies, also scheduling a single device per iteration, in terms of accuracy and scalability with respect ...
Log-likelihood for Bayesian logistic regression with non-iid data distributions (N = 6, L = L′ = 200). ...
arXiv:2009.06419v6
fatcat:2gn7h22tfjfc5pm5wu6eqopkjq
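The local update that DSVGD builds on is plain Stein Variational Gradient Descent. A minimal numpy sketch with an RBF kernel and a standard-normal target; the federated scheduling and global/local particle bookkeeping of the paper are omitted:

import numpy as np

def rbf(X, h=1.0):
    # RBF kernel matrix K and its gradient d k(x_j, x_i) / d x_j.
    diff = X[:, None, :] - X[None, :, :]
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    gradK = -diff * K[..., None] / h ** 2
    return K, gradK

def svgd_step(X, grad_logp, step=0.1):
    # phi(x_i) = mean_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    K, gradK = rbf(X)
    phi = (K @ grad_logp(X) + gradK.sum(axis=0)) / len(X)
    return X + step * phi

X = np.random.default_rng(0).normal(size=(50, 2)) + 3.0
for _ in range(200):
    X = svgd_step(X, lambda X: -X)  # target N(0, I): grad log p(x) = -x
print("particle mean (should be near 0):", X.mean(axis=0))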
Minimax Rates for Conditional Density Estimation via Empirical Entropy
[article]
2021
arXiv
pre-print
... for regression. ...
For joint density estimation, minimax rates have been characterized for general density classes in terms of uniform (metric) entropy, a well-studied notion of statistical capacity. ...
DMR is supported in part by an NSERC Discovery Grant and an Ontario Early Researcher Award. This material is based also upon work supported by the United States Air Force under Contract No. ...
arXiv:2109.10461v2
fatcat:dej5c5h3jjfkxjtu7wswsm6kdq
Deep Modeling of Growth Trajectories for Longitudinal Prediction of Missing Infant Cortical Surfaces
[article]
2020
arXiv
pre-print
Adopting a binary flag in loss calculation to deal with missing data, we fully utilize all available cortical surfaces for training our deep learning model, without requiring a complete collection of longitudinal ...
We will demonstrate with experimental results that our method is capable of capturing the nonlinearity of spatiotemporal cortical growth patterns and can predict cortical surfaces with improved accuracy ...
[19], where they proposed a learning-based framework for predicting dynamic postnatal changes in cortical shape based on the cortical surfaces at birth, using a varifold metric for surface regression ...
arXiv:2009.02797v2
fatcat:xmjy3czkffelzf5ky3rt6anm5a
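The binary-flag device in the first snippet is a masked loss: missing time points simply contribute zero. A generic numpy sketch, with mean squared error used for concreteness; the paper's actual loss terms are richer:

import numpy as np

def masked_mse(pred, target, available):
    # `available` is the binary flag: 1 where a surface was acquired,
    # 0 where it is missing. Missing entries contribute nothing to the loss.
    m = available.astype(float)
    return float(np.sum(m * (pred - target) ** 2) / max(m.sum(), 1.0))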
Adversarial Robustness via Fisher-Rao Regularization
[article]
2022
arXiv
pre-print
... some interesting properties as well as connections with standard regularization metrics. ...
Empirically, we evaluate the performance of various classifiers trained with the proposed loss on standard datasets, showing up to a simultaneous 1% improvement in terms of clean and robust performance ...
ACKNOWLEDGMENT This work was supported by the Natural Sciences and Engineering Research Council of Canada, and McGill University in the framework of the NSERC/Hydro-Quebec Industrial Research Chair in ...
arXiv:2106.06685v2
fatcat:346ven472ncwtfygvmdupon4ae
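For reference, the Fisher-Rao geodesic distance between two categorical distributions (e.g. softmax outputs) has a simple closed form via the Bhattacharyya coefficient. The sketch below assumes that distance is the quantity being regularized, which matches the general setup but may differ from the paper's exact loss:

import numpy as np

def fisher_rao_distance(p, q):
    # Geodesic distance on the probability simplex:
    # d(p, q) = 2 * arccos( sum_i sqrt(p_i * q_i) ).
    bc = np.clip(np.sum(np.sqrt(p * q)), 0.0, 1.0)  # Bhattacharyya coefficient
    return float(2.0 * np.arccos(bc))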
Geometric Losses for Distributional Learning
[article]
2019
arXiv
pre-print
Unlike previous attempts to use optimal transport distances for learning, our loss results in unconstrained convex objective functions, supports infinite (or very large) class spaces, and naturally defines a metric or cost between classes. ...
We set the KL weight to 1, and rescale the KL loss with a factor h × w, to make its gradient of the same order as the one computed with separated binary cross entropy. ...
arXiv:1905.06005v1
fatcat:bp4o56snqnfkfol5lywlfvp3fy
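The rescaling in the last snippet can be read as follows: with a mean-reduced KL over an h × w map, multiplying by h × w restores the scale of a summed per-pixel binary cross entropy, so the two losses produce gradients of comparable magnitude. A hedged numpy sketch of that weighting:

import numpy as np

def rescaled_kl(p, q, eps=1e-12):
    # Mean KL over an (h, w) probability map, rescaled by h*w so its
    # gradient is on the same order as a summed per-pixel cross entropy.
    h, w = p.shape
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(h * w * np.mean(p * (np.log(p) - np.log(q))))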
DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
[article]
2017
arXiv
pre-print
By contrast, the non-parametric (probabilistic) approaches, such as Gaussian Processes (GPs), typically outperform their parametric counterparts, but cannot deal easily with large amounts of data. ...
To this end, we propose a novel VAE semi-parametric modeling framework, named DeepCoder, which combines the modeling power of parametric (convolutional) and nonparametric (ordinal GPs) VAEs, for joint ...
... using non-parametric models. ...
arXiv:1704.02206v2
fatcat:fhqmrhkwz5evdglag26kx4dkgq
Showing results 1–15 of 2,341