
Class-Similarity Based Label Smoothing for Confidence Calibration [article]

Chihuang Liu, Joseph JaJa
2021 arXiv   pre-print
This motivates the development of a new smooth label where the label values are based on similarities with the reference class.  ...  Generating confidence calibrated outputs is of utmost importance for the applications of deep neural networks in safety-critical decision-making systems.  ...  Motivated by directly optimizing the objective of confidence calibration, we propose class-similarity based label smoothing.  ... 
arXiv:2006.14028v2 fatcat:hijs4z2gyzczlap2jeclxkk3p4
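The snippet above describes soft labels whose values depend on similarity to the reference class. A minimal NumPy sketch of that general idea (the similarity matrix `sim`, the smoothing mass `eps`, and the normalization are illustrative assumptions, not the authors' exact formulation):

```python
import numpy as np

def similarity_smoothed_label(true_idx, sim, eps=0.1):
    # Hypothetical sketch: keep mass 1 - eps on the true class and spread the
    # remaining eps over the other classes in proportion to their similarity
    # to the true class (sim is a symmetric class-similarity matrix).
    w = sim[true_idx].astype(float).copy()
    w[true_idx] = 0.0            # the true class keeps its own fixed share
    w = w / w.sum()              # normalize similarities over the other classes
    y = eps * w
    y[true_idx] = 1.0 - eps
    return y
```

Unlike uniform label smoothing, a more similar class (e.g. "dog" for "wolf") receives a larger share of the smoothing mass than a dissimilar one.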

Instance-based Label Smoothing for Better Classifier Calibration

Mohamed Maher
Additionally, we propose a new instance-based label smoothing method for logistic regression fitting.  ...  In previous works, it was shown that label smoothing has a positive calibration and generalization effect on the network predictions.  ...  label smoothing (Instance-based label smoothing).  ... 
doi:10.6084/m9.figshare.12902501.v1 fatcat:wtr73nmljnbb5lxqbikxg3svpa

Class-Distribution-Aware Calibration for Long-Tailed Visual Recognition [article]

Mobarakol Islam, Lalithkumar Seenivasan, Hongliang Ren, Ben Glocker
2021 arXiv   pre-print
However, the use of a uniform TS or LS factor may not be optimal for calibrating models trained on a long-tailed dataset where the model produces overly confident probabilities for high-frequency classes  ...  Recent techniques like temperature scaling (TS) and label smoothing (LS) show effectiveness in obtaining a well-calibrated model by smoothing logits and hard labels with scalar factors, respectively.  ...  Class-Distribution-Aware LS Label smoothing (LS) limits the network to produce overly confident predictions by squeezing the true one-hot label in the CE loss calculation.  ... 
arXiv:2109.05263v1 fatcat:klfnmzsokbdwpf7yehqe6ryjli
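The two scalar-factor techniques contrasted in the snippet above can be sketched in a few lines of NumPy (the temperature `T` and smoothing factor `eps` values are illustrative):

```python
import numpy as np

def temperature_scale(logits, T=1.5):
    # Temperature scaling: divide logits by a scalar T > 1 before the softmax
    # to soften (de-peak) the predicted distribution.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)   # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def smooth_labels(one_hot, eps=0.1):
    # Label smoothing: squeeze the one-hot target so the true class gets
    # 1 - eps + eps/K and every class receives a uniform eps/K share.
    K = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / K
```

Both act with a single scalar applied identically to every class, which is exactly what the class-distribution-aware variant above argues is suboptimal for long-tailed data.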

Instance-based Label Smoothing For Better Calibrated Classification Networks [article]

Mohamed Maher, Meelis Kull
2021 arXiv   pre-print
Label smoothing is widely used in deep neural networks for multi-class classification.  ...  Our methods show better generalization and calibration over standard label smoothing on various deep neural architectures and image classification datasets.  ...  Label smoothing was found to distort the similarity information among classes, which affects the model's class-wise calibration and distillation performance.  ... 
arXiv:2110.05355v1 fatcat:mzzw6msoprf4vpc2phc2fyeqrq

When Does Label Smoothing Help? [article]

Rafael Müller, Simon Kornblith, Geoffrey Hinton
2020 arXiv   pre-print
This results in loss of information in the logits about resemblances between instances of different classes, which is necessary for distillation, but does not hurt generalization or calibration of the  ...  Here we show empirically that in addition to improving generalization, label smoothing improves model calibration which can significantly improve beam-search.  ...  Acknowledgements We would like to thank Mohammad Norouzi, William Chan, Kevin Swersky, Danijar Hafner and Rishabh Agarwal for the discussions and suggestions.  ... 
arXiv:1906.02629v3 fatcat:ycs7piaxyfcyjcoadnenbnixym

Improving Calibration through the Relationship with Adversarial Robustness [article]

Yao Qin, Xuezhi Wang, Alex Beutel, Ed H. Chi
2021 arXiv   pre-print
labels for an example based on how easily it can be attacked by an adversary.  ...  To this end, we propose Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS) that integrates the correlations of adversarial robustness and calibration into training by adaptively softening  ...  To this end, we propose a method named Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS), which performs label smoothing at different degrees to the training data based on their adversarial  ... 
arXiv:2006.16375v2 fatcat:6iu6ye6znvdwdcm3vibey6u5qi

On-manifold Adversarial Data Augmentation Improves Uncertainty Calibration [article]

Kanil Patel, William Beluch, Dan Zhang, Michael Pfeiffer, Bin Yang
2021 arXiv   pre-print
Variants of OMADA can employ different sampling schemes for ambiguous on-manifold examples based on the entropy of their estimated soft labels, which exhibit specific strengths for generalization, calibration  ...  adversarial attack path in the latent space of an autoencoder-based generative model that closely approximates decision boundaries between two or more classes.  ...  Soft labels were also used to improve generalization via ε-smoothing [17], where a probability mass of size ε is distributed over all but the correct class, thus penalizing over-confident predictions.  ... 
arXiv:1912.07458v5 fatcat:dnebtyxpovc3jmlvc5h3esciey

On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks [article]

Sunil Thulasidasan, Gopinath Chennupati, Jeff Bilmes, Tanmoy Bhattacharya, Sarah Michalak
2020 arXiv   pre-print
Additionally, we find that merely mixing features does not result in the same calibration benefit and that the label smoothing in mixup training plays a significant role in improving calibration.  ...  Mixup is a recently proposed method for training deep neural networks where additional samples are generated during training by convexly combining random pairs of images and their associated labels.  ...  Acknowledgments We would like to thank the anonymous referees for their valuable suggestions for improving the paper.  ... 
arXiv:1905.11001v5 fatcat:wdkf3rlymvhiblwdazbea3ja64
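The convex combination of image pairs and labels described above can be sketched directly (the Beta parameter `alpha=0.2` is an illustrative choice):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    # Draw a mixing weight lam ~ Beta(alpha, alpha) and convexly combine a
    # pair of inputs and their one-hot labels; the mixed label is soft,
    # which is the implicit label-smoothing effect discussed above.
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam
```

The paper's finding is that the soft label `y`, not the mixed features `x`, is what carries most of the calibration benefit.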

Spatially Varying Label Smoothing: Capturing Uncertainty from Expert Annotations [article]

Mobarakol Islam, Ben Glocker
2021 arXiv   pre-print
We build upon label smoothing (LS), where a network is trained on 'blurred' versions of the ground-truth labels, which has been shown to be effective for calibrating output predictions.  ...  However, LS does not take the local structure into account and results in overly smoothed predictions with low confidence even for non-ambiguous regions.  ...  There is evidence that strategies such as label smoothing (LS) [20, 25, 29] and temperature scaling [10, 15, 16] are useful for calibration and uncertainty quantification for independent class prediction  ... 
arXiv:2104.05788v1 fatcat:jqu4b6dn3ngs7awpplwh6zunce

Improving Unsupervised Image Clustering With Robust Learning [article]

Sungwon Park, Sungwon Han, Sundong Kim, Danu Kim, Sungkyu Park, Seunghoon Hong, Meeyoung Cha
2020 arXiv   pre-print
Extensive experiments show that the proposed model can adjust the model confidence with better calibration and gain additional robustness against adversarial noise.  ...  RUC's novelty lies in utilizing pseudo-labels of existing image clustering models as a noisy dataset that may include misclassified samples.  ...  We guide the semi-supervised class assignment with robust learning techniques, such as co-training and label smoothing, to account for inherent label noise.  ... 
arXiv:2012.11150v1 fatcat:wdrsahqrkrfljaszxu3dlryyku

Revisiting Calibration for Question Answering [article]

Chenglei Si, Chen Zhao, Sewon Min, Jordan Boyd-Graber
2022 arXiv   pre-print
We examine various conventional calibration methods including temperature scaling, feature-based classifier, neural answer reranking, and label smoothing, all of which do not bring significant gains under  ...  For example, after conventional temperature scaling, confidence scores become similar for all predictions, which makes it hard for users to distinguish correct predictions from wrong ones, even though  ...  In label smoothing, we assign the gold label probability mass α, and each of the remaining classes (1−α)/(|Y|−1).  ... 
arXiv:2205.12507v1 fatcat:vl5mpwasnjbwbnp2juhf63wmue
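The allocation stated in the snippet above (the gold label gets mass α and the other |Y|−1 classes split the remaining 1−α evenly) can be written directly; `alpha=0.9` is an illustrative default:

```python
import numpy as np

def smoothed_target(gold_idx, num_classes, alpha=0.9):
    # Gold label receives probability mass alpha; each of the remaining
    # num_classes - 1 labels receives (1 - alpha) / (num_classes - 1).
    y = np.full(num_classes, (1.0 - alpha) / (num_classes - 1))
    y[gold_idx] = alpha
    return y
```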

Learning with Retrospection [article]

Xiang Deng, Zhongfei Zhang
2020 arXiv   pre-print
Extensive experiments on several benchmark datasets demonstrate the superiority of LWR for training DNNs.  ...  LWR is a simple yet effective training framework to improve accuracies, calibration, and robustness of DNNs without introducing any additional network parameters or inference cost, only with a negligible  ...  In a one-hot label, the probability (confidence score) for the ground-truth class is set to 1 while the probabilities for the other classes are all 0s, which means that the labels for different classes  ... 
arXiv:2012.13098v1 fatcat:cvzot3nbsrh5jibvtfetmuh224

A Comparative Study of Confidence Calibration in Deep Learning: From Computer Vision to Medical Imaging [article]

Riqiang Gao, Thomas Li, Yucheng Tang, Zhoubing Xu, Michael Kammer, Sanja L. Antic, Kim Sandler, Fabien Maldonado, Thomas A. Lasko, Bennett Landman
2022 arXiv   pre-print
vision tasks and medical imaging prediction, e.g., calibration methods ideal for general computer vision tasks may in fact damage the calibration of medical imaging prediction. (3) We also reinforce previous  ...  can lead to under-confident prediction, and simpler calibration models from the computer vision domain tend to be more generalizable to medical imaging. (2) We highlight the gap between general computer  ...  This project was supported in part by the National Center for Research Resources, Grant UL1 RR024975-01, and is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06.  ... 
arXiv:2206.08833v1 fatcat:xztdblt6xfffpfloejhvkc7lce

Self-Distillation as Instance-Specific Label Smoothing [article]

Zhilu Zhang, Mert R. Sabuncu
2020 arXiv   pre-print
Finally, we propose a novel instance-specific label smoothing technique that promotes predictive diversity without the need for a separately trained teacher model.  ...  It has been recently demonstrated that multi-generational self-distillation can improve generalization. Despite this intriguing observation, reasons for the enhancement remain poorly understood.  ...  Similar observations were also made when label smoothing is applied [27] .  ... 
arXiv:2006.05065v2 fatcat:57ntxt2fsncw3lgmjju5hsuzeq

Knowledge distillation from language model to acoustic model: a hierarchical multi-task learning approach [article]

Mun-Hak Lee, Joon-Hyuk Chang
2021 arXiv   pre-print
label-interpolation-based distillation method.  ...  We propose an acoustic model structure with multiple auxiliary output layers for cross-modal distillation and demonstrate that the proposed method effectively compensates for the shortcomings of the existing  ...  (b) This is the case calibrated using the label smoothing method, and the label smoothing method only generates a well-calibrated posterior probability for the 1-best class (top), but tends to be under-confident  ... 
arXiv:2110.10429v1 fatcat:oo64s4donjgvrfspqw3bvutooq
Showing results 1 — 15 out of 28,034 results