22,913 Hits in 5.1 sec

Adaptive Sampled Softmax with Kernel Based Sampling [article]

Guy Blanc, Steffen Rendle
2018 arXiv   pre-print
Kernel-based sampling adapts to the model as it is trained, thus resulting in low bias.  ...  We empirically study the trade-off of bias, sampling distribution and sample size, and show that kernel-based sampling results in low bias with few samples.  ...  Final model quality when training a sampled softmax with different sampling distributions (uniform, quadratic, softmax) and numbers of samples, m.  ... 
arXiv:1712.00527v2 fatcat:zse6rxnr55celdvskajqzzyg34
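For reference, a minimal NumPy sketch of the sampled softmax with a quadratic kernel proposal, as this abstract describes; the function name, the kernel parameter `alpha`, and the toy sizes are illustrative assumptions, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sampled_softmax_loss(logits, target, m=20, alpha=1.0):
    """Sampled-softmax cross-entropy with a kernel-based proposal.

    Negatives are drawn with probability q_j proportional to
    alpha * o_j**2 + 1 (a quadratic kernel of the logits, so the proposal
    adapts to the model as it trains), and the sampled logits get the
    usual importance correction o'_j = o_j - log(m * q_j).
    """
    q = alpha * logits ** 2 + 1.0
    q /= q.sum()
    neg = rng.choice(len(logits), size=m, replace=False, p=q)
    neg = neg[neg != target]                      # drop target if drawn
    z = np.concatenate(([logits[target]],
                        logits[neg] - np.log(m * q[neg])))
    z -= z.max()                                  # numerical stability
    return -z[0] + np.log(np.exp(z).sum())        # index 0 is the target

# toy example: 1000 classes, target class 3
logits = rng.standard_normal(1000)
loss = sampled_softmax_loss(logits, target=3)
```

The cross-entropy is computed over the target plus only m sampled classes, so the cost per step is O(m) rather than O(n) in the number of classes.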

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation [article]

Zhenda Xie, Zheng Zhang, Xizhou Zhu, Gao Huang, Stephen Lin
2020 arXiv   pre-print
To circumvent this issue, we make use of a reparameterization trick based on the Gumbel-Softmax distribution, with which backpropagation can iterate these variables towards binary values.  ...  A technical challenge of this sampling-based approach is that the binary decision variables for representing discrete sampling locations are non-differentiable, making them incompatible with backpropagation  ...  Our sampling is based on the two-class Gumbel-Softmax distribution, which was first introduced in reinforcement learning [18] to simulate stochastic discrete sampling.  ... 
arXiv:2003.08866v4 fatcat:k26aymy4jrctro3yvw2xd7yaim

Template-Instance Loss for Offline Handwritten Chinese Character Recognition [article]

Yao Xiao, Dan Meng, Cewu Lu, Chi-Keung Tang
2019 arXiv   pre-print
First, the character template is designed to deal with the intrinsic similarities among Chinese characters.  ...  Trained with the new loss functions using our HCCR14Layer deep network architecture consisting of simple layers, our extensive experiments show that it yields state-of-the-art performance and beyond  ...  We propose instance loss, which is achieved by an adaptive margin based on the softmax loss.  ... 
arXiv:1910.05545v1 fatcat:cvdotavc2jarpj3k2h3xpbaejm

Fast Variational AutoEncoder with Inverted Multi-Index for Collaborative Filtering [article]

Jin Chen, Binbin Jin, Xu Huang, Defu Lian, Kai Zheng, Enhong Chen
2021 arXiv   pre-print
Importance sampling is an effective approximation method, based on which the sampled softmax has been derived.  ...  To this end, we propose to decompose the inner-product-based softmax probability based on the inverted multi-index, leading to sublinear-time and highly accurate sampling.  ...  Specifically, the dynamic sampler first draws samples uniformly from the item set and the item with the maximum rating is selected. • Kernel based sampler [6] is a recent method for adaptively sampled  ... 
arXiv:2109.05773v1 fatcat:za7g4ocey5ecvnufouwctk6ggu
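The dynamic sampler the snippet describes (draw candidates uniformly, then keep the item the model currently rates highest) can be sketched as follows; the names and pool size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def dynamic_negative_sampler(scores, pool_size=32):
    """Draw a uniform pool of candidate items and return the one the
    model currently rates highest -- a hard negative for training."""
    pool = rng.choice(len(scores), size=pool_size, replace=False)
    return pool[int(np.argmax(scores[pool]))]

# toy check: with the pool covering all items, the top-rated item wins
scores = np.zeros(100)
scores[7] = 10.0
item = dynamic_negative_sampler(scores, pool_size=100)
```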

Study of the Few-Shot Learning for ECG Classification Based on the PTB-XL Dataset

Krzysztof Pałczyński, Sandra Śmigiel, Damian Ledziński, Sławomir Bujnowski
2022 Sensors  
The results of the FSL network were compared with the evaluation score of the neural network performing softmax-based classification.  ...  The proposed network also achieved better results in classifying five different disease classes than softmax-based counterparts, with an accuracy of 80.2–77.9% as opposed to 77.1–75.1%.  ...  ECG files come in two other options with 500 Hz and 100 Hz sampling rates with 16-bit resolution. The research used ECGs with 500 Hz sampling rates.  ... 
doi:10.3390/s22030904 pmid:35161650 pmcid:PMC8839938 fatcat:ieephn2lgfb4zi3p6c74cgyhyq

Last Layer Marginal Likelihood for Invariance Learning [article]

Pola Schwöbel, Martin Jørgensen, Sebastian W. Ober, Mark van der Wilk
2022 arXiv   pre-print
Traditionally, these are hand-crafted and tuned with cross validation.  ...  Computing the marginal likelihood is hard for neural networks, but success with tractable approaches that compute the marginal likelihood for the last layer only raises the question of whether this convenient  ...  While both likelihoods achieve similar test accuracies, we observe a 2.3× speedup per iteration in training for the sample-based Softmax over the Gaussian model.  ... 
arXiv:2106.07512v2 fatcat:5ojyoa62kra6jlw4hoaqnwqema

TDACNN: Target-domain-free Domain Adaptation Convolutional Neural Network for Drift Compensation in Gas Sensors [article]

Yuelin Zhang, Jia Yan, Zehuan Wang, Xiaoyan Peng, Yutong Tian, Shukai Duan
2021 arXiv   pre-print
To optimize network training, an additive angular margin softmax loss with parameter dynamic adjustment is utilized.  ...  To compensate for this, in this paper, deep learning based on a target-domain-free domain adaptation convolutional neural network (TDACNN) is proposed.  ...  Using the kernel function φ(·), samples are mapped to a reproducing kernel Hilbert space (RKHS), and the MMD between source domain s and target domain t is defined as a set of kernel functions in  ... 
arXiv:2110.07509v2 fatcat:eytrbee53bgvnkcgtm3jemccme
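The MMD the snippet defines can be estimated from finite samples; a minimal NumPy sketch with a Gaussian kernel (the kernel choice and `gamma` value are illustrative assumptions):

```python
import numpy as np

def mmd2(X, Y, gamma=1.0):
    """Squared maximum mean discrepancy between samples X (source domain)
    and Y (target domain) in the RKHS induced by the Gaussian kernel
    k(a, b) = exp(-gamma * ||a - b||**2)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 4))
same = mmd2(X, X)           # identical samples -> zero discrepancy
shifted = mmd2(X, X + 5.0)  # shifted samples -> positive discrepancy
```

Minimising this quantity between source- and target-domain features is the usual way such kernel-based domain adaptation losses are implemented.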

Multi-Layer domain adaptation method for rolling bearing fault diagnosis

Xiang Li, Wei Zhang, Qian Ding, Jian-Qiao Sun
2019 Signal Processing  
Acknowledgements The material in this paper is based on work supported by grants (11172197, 11332008, and 11572215) from the National Science Foundation of China.  ...  The remainder of this paper starts with the theoretical background in Section 2. The domain adaptation problem, convolutional neural network, MMD and softmax classifier are introduced.  ...  On the other hand, for the Without-DA method without domain adaptation, while the samples with the same health condition labels cluster well using the same network architecture with the proposed method  ... 
doi:10.1016/j.sigpro.2018.12.005 fatcat:kb6pqprnszhqteey5epevajppy

Learning Graph-Level Representations with Recurrent Neural Networks [article]

Yu Jin, Joseph F. JaJa
2018 arXiv   pre-print
Graph nodes are mapped into node sequences sampled from random walk approaches approximated by the Gumbel-Softmax distribution.  ...  ADAM is used for optimization with an initial learning rate of 0.0001 and adaptive decay [33] .  ...  Following the standard practice, we use the ADAM optimizer with an initial learning rate of 0.0001 and adaptive decay [33] .  ... 
arXiv:1805.07683v4 fatcat:lsrbrfswtjejzcdbtggm77sa7y

Deep face recognition with clustering based domain adaptation [article]

Mei Wang, Weihong Deng
2022 arXiv   pre-print
In this paper, we propose a new clustering-based domain adaptation method designed for face recognition task in which the source and target domain do not share any classes.  ...  State-of-the-art performance of GBU data set is achieved by only unsupervised adaptation from the target training data.  ...  Finally, we can annotate all clustered nodes with pseudo label ŷt i , and adapt the network with supervision of Softmax loss. B.  ... 
arXiv:2205.13937v1 fatcat:2ttgx6ldsfchxnv6zieofrewui

Two-Stage Monte Carlo Denoising with Adaptive Sampling and Kernel Pool [article]

Tiange Xiang, Hongliang Yuan, Haozhi Huang, Yujin Shi
2021 arXiv   pre-print
In this paper, we tackle the problems in Monte Carlo rendering by proposing a two-stage denoiser based on the adaptive sampling strategy.  ...  In the first stage, concurrent to adjusting samples per pixel (spp) on-the-fly, we reuse the computations to generate extra denoising kernels applied to the adaptively rendered image.  ...  Per-pixel kernels are grid sampled from softmax(π) with the predicted kernel map h.  ... 
arXiv:2103.16115v1 fatcat:zy3a3xoxsndixf6xnm3cbpedji
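Sampling per-pixel kernels from softmax(π) amounts to normalising each pixel's kernel logits and taking a weighted sum over its neighbourhood; a minimal NumPy sketch (the shapes and the function name are assumptions, not the paper's code):

```python
import numpy as np

def apply_per_pixel_kernels(img, kernel_logits):
    """Apply per-pixel denoising kernels: softmax-normalise each pixel's
    k*k logits, then take the weighted sum over its k*k neighbourhood.
    img: (H, W); kernel_logits: (H, W, k*k) with k odd."""
    H, W, kk = kernel_logits.shape
    k = int(round(np.sqrt(kk)))
    r = k // 2
    w = np.exp(kernel_logits - kernel_logits.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)                # softmax over kernel taps
    padded = np.pad(img, r, mode="edge")
    out = np.zeros((H, W), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += w[..., dy * k + dx] * padded[dy:dy + H, dx:dx + W]
    return out

# sanity check: uniform logits give a box filter, so a constant image
# passes through unchanged
denoised = apply_per_pixel_kernels(np.ones((4, 4)), np.zeros((4, 4, 9)))
```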

Enhancing Collaborative and Geometric Multi-Kernel Learning Using Deep Neural Network

Bareera Zafar, Syed Abbas Zilqurnain Naqvi, Muhammad Ahsan, Allah Ditta, Ummul Baneen, Muhammad Adnan Khan
2022 Computers Materials & Continua  
CGMKL combines multiple kernel learning with softmax function using the framework of multi empirical kernel learning (MEKL) in which empirical kernel mapping (EKM) provides explicit feature construction  ...  CGMKL ensures the consistent output of samples across kernel spaces and minimizes the within-class distance to highlight geometric features of multiple classes.  ...  To improve the regular MKL, extended MKL techniques have been proposed, e.g., Localized MKL (LMKL), novel sample-wise alternating optimization for training LMKL [24] ; Sample Adaptive MKL, adaptively switch  ... 
doi:10.32604/cmc.2022.027874 fatcat:nwpoeh6oircovij64vqaia4fky

Locally Adaptive Learning Loss for Semantic Image Segmentation [article]

Jinjiang Guo, Pengyuan Ren, Aiguo Gu, Jian Xu, Weixin Wu
2018 arXiv   pre-print
to regional connections between adjacent pixels based on their categories.  ...  Stride by stride, our method firstly conducts adaptive pooling filter operating over predicted feature maps, aiming to merge predicted distributions over a small group of neighboring pixels with same category  ...  accuracy, compared with plain softmax cross-entropy.  ... 
arXiv:1802.08290v2 fatcat:aou5jrcjqjfdhk5n4q6fpsu5t4

Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes [article]

Jake Snell, Richard Zemel
2021 arXiv   pre-print
Instead, we propose a Gaussian process classifier based on a novel combination of Pólya-Gamma augmentation and the one-vs-each softmax approximation that allows us to efficiently marginalize over functions  ...  Few-shot classification (FSC), the task of adapting a classifier to unseen classes given a small labeled dataset, is an important step on the path toward human-like machine learning.  ...  We experimented with several kernels and found the cosine and linear kernels to generally outperform RBF-based kernels (see Section E for detailed comparisons).  ... 
arXiv:2007.10417v2 fatcat:znnfs32gandwdn5cy3mc43fo74
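The one-vs-each approximation bounds the softmax probability of a class by a product of pairwise sigmoids, p(k) >= prod over j != k of sigmoid(f_k - f_j), which is what makes the Pólya-Gamma augmentation tractable; a small NumPy sketch (illustrative, not the paper's code):

```python
import numpy as np

def ove_log_prob(logits, k):
    """One-vs-each lower bound on the softmax log-probability of class k:
    log p(k) >= sum over j != k of log sigmoid(f_k - f_j)."""
    diff = logits[k] - np.delete(logits, k)
    return -np.logaddexp(0.0, -diff).sum()   # sum of log-sigmoids

def log_softmax_prob(logits, k):
    """Exact softmax log-probability, for comparison."""
    return logits[k] - np.logaddexp.reduce(logits)

logits = np.array([1.0, -0.5, 0.3, 2.0])
lb = ove_log_prob(logits, 0)       # lower bound
exact = log_softmax_prob(logits, 0)
```

With only two classes the bound is tight (it reduces to the exact sigmoid likelihood); with more classes it is a strict lower bound.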

Enhancing Transformation-based Defenses using a Distribution Classifier [article]

Connie Kou, Hwee Kuan Lee, Ee-Chien Chang, Teck Khim Ng
2020 arXiv   pre-print
Our method is generic and can be integrated with existing transformation-based defenses.  ...  With these observations, we propose a method to improve existing transformation-based defenses.  ...  For each image, we build the marginal distributions of the softmax for each class using kernel density estimation with a Gaussian kernel. The kernel width is optimized to be 0.05.  ... 
arXiv:1906.00258v2 fatcat:cbn3qnrdungodl74tvedgtloz4
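The kernel density estimation step described above, with a Gaussian kernel of width 0.05, can be sketched as follows (the function name and evaluation grid are illustrative assumptions):

```python
import numpy as np

def gaussian_kde_pdf(samples, x, h=0.05):
    """Gaussian-kernel density estimate with bandwidth h, evaluated at x.
    Used here to model the marginal distribution of one class's softmax
    score across an image's random transformations."""
    z = (np.asarray(x)[..., None] - np.asarray(samples)) / h
    return (np.exp(-0.5 * z ** 2).sum(-1)
            / (len(samples) * h * np.sqrt(2.0 * np.pi)))

# softmax scores of one class over three transformed copies of an image
scores = np.array([0.30, 0.50, 0.70])
grid = np.linspace(-1.0, 2.0, 3001)
pdf = gaussian_kde_pdf(scores, grid)
```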
Showing results 1 — 15 out of 22,913 results