IA Scholar Query: Vladimir Vovk
https://scholar.archive.org/
Internet Archive Scholar query results feed
info@archive.org | Fri, 16 Sep 2022 00:00:00 GMT | fatcat-scholar | https://scholar.archive.org/help

Conformal prediction beyond exchangeability
https://scholar.archive.org/work/q6bopqwzzvfnfchtqpfopgzdea
Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data, and symmetry of the given model fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use a nonsymmetric algorithm that treats recent observations as more relevant. This paper generalizes conformal prediction to deal with both aspects: we employ weighted quantiles to introduce robustness against distribution drift, and design a new randomization technique to allow for algorithms that do not treat data points symmetrically. Our new methods are provably robust, with substantially less loss of coverage when exchangeability is violated due to distribution drift or other challenging features of real data, while also achieving the same coverage guarantees as existing conformal prediction methods if the data points are in fact exchangeable. We demonstrate the practical utility of these new tools with simulations and real-data experiments on electricity and election forecasting.
Rina Foygel Barber, Emmanuel J. Candes, Aaditya Ramdas, Ryan J. Tibshirani
Fri, 16 Sep 2022 00:00:00 GMT

Bregman Deviations of Generic Exponential Families
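The weighted-quantile construction described in the "Conformal prediction beyond exchangeability" abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the residual scores, the geometric decay rate, and the zero point prediction below are all hypothetical choices.

```python
import numpy as np

def weighted_conformal_interval(cal_scores, weights, mu_test, alpha=0.1):
    """Nonexchangeable split conformal (sketch): take a weighted quantile
    of the calibration scores, with the unseen test point given weight 1
    and score +inf so the quantile is conservative."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(cal_scores, dtype=float)
    w_tilde = np.append(w, 1.0) / (w.sum() + 1.0)   # normalized weights
    s_aug = np.append(s, np.inf)
    order = np.argsort(s_aug)
    cdf = np.cumsum(w_tilde[order])
    # Smallest score whose weighted CDF reaches 1 - alpha.
    qhat = s_aug[order][np.searchsorted(cdf, 1 - alpha)]
    return mu_test - qhat, mu_test + qhat

rng = np.random.default_rng(0)
cal_scores = np.abs(rng.normal(size=200))       # |y - mu(x)| residuals
weights = 0.99 ** np.arange(200, 0, -1)         # geometric decay: recent > old
lo, hi = weighted_conformal_interval(cal_scores, weights, mu_test=0.0)
```

With all weights equal this reduces to ordinary split conformal; decaying the weights trades a little interval width for robustness to drift.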
https://scholar.archive.org/work/entptbnxxreihbt55geqe34zea
We revisit the method-of-mixtures technique, also known as the Laplace method, for studying the concentration phenomenon in generic exponential families. Combining the properties of the Bregman divergence associated with the log-partition function of the family with the method of mixtures for supermartingales, we establish a generic bound controlling the Bregman divergence between the parameter of the family and a finite-sample estimate of the parameter. Our bound is time-uniform and introduces a quantity extending the classical information gain to exponential families, which we call the Bregman information gain. For the practitioner, we instantiate this novel bound for several classical families, e.g., Gaussian, Bernoulli, Exponential, Weibull, Pareto, Poisson and Chi-square, yielding explicit forms of the confidence sets and the Bregman information gain. We further numerically compare the resulting confidence bounds to state-of-the-art alternatives for time-uniform concentration and show that this novel method yields competitive results. Finally, we highlight the benefit of our concentration bounds in some illustrative applications.
Sayak Ray Chowdhury, Patrick Saux, Odalric-Ambrym Maillard, Aditya Gopalan
Wed, 14 Sep 2022 00:00:00 GMT

Conformal Methods for Quantifying Uncertainty in Spatiotemporal Data: A Survey
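For the simplest member of this family, a Gaussian with known variance, the method of mixtures already gives a closed-form time-uniform radius. The sketch below uses a standard N(0, rho^2) mixture over the betting parameter and Ville's inequality; it only illustrates the technique, not the paper's general Bregman bound, and sigma, rho, and delta are illustrative values.

```python
import numpy as np

def gaussian_mixture_radius(t, sigma=1.0, rho=1.0, delta=0.05):
    """Time-uniform radius for |sample_mean_t - mu|, from mixing the
    e-processes exp(lam*S_t - lam^2*t*sigma^2/2) over lam ~ N(0, rho^2):
    the mixture equals exp(rho^2*S_t^2 / (2*(1 + rho^2*t*sigma^2))) /
    sqrt(1 + rho^2*t*sigma^2), and Ville's inequality bounds it by 1/delta
    simultaneously for all t with probability >= 1 - delta."""
    v = 1.0 + rho**2 * t * sigma**2
    bound_S = np.sqrt(2.0 * v / rho**2 * np.log(np.sqrt(v) / delta))
    return bound_S / t

t = np.arange(1, 1001)
radius = gaussian_mixture_radius(t)
# The radius shrinks like sqrt(log(t)/t), so the confidence sequence tightens.
```

The same recipe, with the log-partition function replacing the quadratic exponent, is what the Bregman machinery generalizes beyond the Gaussian case.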
https://scholar.archive.org/work/d3yhexoldbawja4ikzf7wa467i
Machine learning methods are increasingly widely used in high-risk settings such as healthcare, transportation, and finance. In these settings, it is important that a model produces calibrated uncertainty to reflect its own confidence and avoid failures. In this paper we survey recent works on uncertainty quantification (UQ) for deep learning, in particular the distribution-free conformal prediction method, chosen for its mathematical properties and wide applicability. We cover the theoretical guarantees of conformal methods, introduce techniques that improve calibration and efficiency for UQ in the context of spatiotemporal data, and discuss the role of UQ in safe decision making.
Sophia Sun
Thu, 08 Sep 2022 00:00:00 GMT

Uncertainty Sets for Image Classifiers using Conformal Prediction
https://scholar.archive.org/work/o5l4dtmonjdgpatchs2vhkethq
Convolutional image classifiers can achieve high predictive accuracy, but quantifying their uncertainty remains an unresolved challenge, hindering their deployment in consequential settings. Existing uncertainty quantification techniques, such as Platt scaling, attempt to calibrate the network's probability estimates, but they do not have formal guarantees. We present an algorithm that modifies any classifier to output a predictive set containing the true label with a user-specified probability, such as 90%. The algorithm is simple and fast like Platt scaling, but provides a formal finite-sample coverage guarantee for every model and dataset. Our method modifies an existing conformal prediction algorithm to give more stable predictive sets by regularizing the small scores of unlikely classes after Platt scaling. In experiments on both Imagenet and Imagenet-V2 with ResNet-152 and other classifiers, our scheme outperforms existing approaches, achieving coverage with sets that are often factors of 5 to 10 smaller than a stand-alone Platt scaling baseline.
Anastasios Angelopoulos, Stephen Bates, Jitendra Malik, Michael I. Jordan
Sat, 03 Sep 2022 00:00:00 GMT

A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification
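The regularized construction described in the "Uncertainty Sets for Image Classifiers" abstract can be sketched as follows. This is a simplification: the Dirichlet-generated "softmax" outputs and labels are synthetic placeholders, the hyperparameters lam and k_reg are arbitrary, and the paper's randomization step is omitted.

```python
import numpy as np

def raps_scores(probs, labels, lam=0.01, k_reg=5):
    """Conformity score: total softmax mass of classes ranked at or above
    the true label, plus a penalty on deep ranks (the regularization that
    shrinks the contribution of unlikely classes)."""
    n, C = probs.shape
    order = np.argsort(-probs, axis=1)                 # most likely first
    ranks = np.empty_like(order)
    np.put_along_axis(ranks, order, np.arange(C), axis=1)
    cum = np.cumsum(np.take_along_axis(probs, order, axis=1), axis=1)
    r = ranks[np.arange(n), labels]                    # 0-indexed rank of truth
    return cum[np.arange(n), r] + lam * np.maximum(0, r + 1 - k_reg)

def raps_set(prob, qhat, lam=0.01, k_reg=5):
    """All classes whose penalized cumulative score is within qhat."""
    order = np.argsort(-prob)
    cum = np.cumsum(prob[order])
    penalty = lam * np.maximum(0, np.arange(1, len(prob) + 1) - k_reg)
    k = max(1, np.searchsorted(cum + penalty, qhat, side="right"))
    return order[:k]

rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(10), size=500)           # fake softmax outputs
labels = np.array([rng.choice(10, p=p) for p in probs])
s = raps_scores(probs, labels)
n = len(s)
qhat = np.quantile(s, np.ceil((n + 1) * 0.9) / n, method="higher")
pred_set = raps_set(probs[0], qhat)
```

Setting lam to 0 recovers the unregularized adaptive-sets baseline; the penalty is what stabilizes set sizes on hard examples.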
https://scholar.archive.org/work/boxe5mcrsng7romcegxrh76lym
Black-box machine learning models are now routinely used in high-risk settings, like medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models. Critically, the sets are valid in a distribution-free sense: they possess explicit, non-asymptotic guarantees even without distributional assumptions or model assumptions. One can use conformal prediction with any pre-trained model, such as a neural network, to produce sets that are guaranteed to contain the ground truth with a user-specified probability, such as 90%. It is easy to understand, easy to use, and general, applying naturally to problems arising in computer vision, natural language processing, deep reinforcement learning, and so on. This hands-on introduction aims to provide the reader with a working understanding of conformal prediction and related distribution-free uncertainty quantification techniques in one self-contained document. We lead the reader through practical theory for and examples of conformal prediction and describe its extensions to complex machine learning tasks involving structured outputs, distribution shift, time series, outliers, models that abstain, and more. Throughout, there are many explanatory illustrations, examples, and code samples in Python. With each code sample comes a Jupyter notebook implementing the method on a real-data example; the notebooks can be accessed and easily run using our codebase.
Anastasios N. Angelopoulos, Stephen Bates
Sat, 03 Sep 2022 00:00:00 GMT

Estimating means of bounded random variables by betting
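The core recipe that the gentle introduction teaches, split conformal prediction, fits in a few lines. The sketch below uses a synthetic setup: the "model" that predicts zero and the Gaussian data are placeholders, not an example from the document.

```python
import numpy as np

def split_conformal_radius(residuals, alpha=0.1):
    """Split conformal: turn holdout residuals |y - f(x)| into a radius
    qhat so that f(x_test) +/- qhat covers y_test with probability at
    least 1 - alpha, for any pretrained model f and exchangeable data."""
    n = len(residuals)
    level = np.ceil((n + 1) * (1 - alpha)) / n      # finite-sample correction
    return np.quantile(residuals, level, method="higher")

rng = np.random.default_rng(0)
# Placeholder "model": predicts 0; the truth is standard normal noise.
cal_y, test_y = rng.normal(size=1000), rng.normal(size=1000)
qhat = split_conformal_radius(np.abs(cal_y))
coverage = np.mean(np.abs(test_y) <= qhat)          # close to 0.9
```

The ceil((n+1)(1-alpha))/n level, rather than a plain (1-alpha) quantile, is what makes the 90% guarantee hold exactly in finite samples.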
https://scholar.archive.org/work/56h4hjpcfjfttbkukfsbx7qii4
This paper derives confidence intervals (CIs) and time-uniform confidence sequences (CSs) for the classical problem of estimating an unknown mean from bounded observations. We present a general approach for deriving concentration bounds that can be seen as a generalization and improvement of the celebrated Chernoff method. At its heart is a class of composite nonnegative martingales, with strong connections to testing by betting and the method of mixtures. We show how to extend these ideas to sampling without replacement, another heavily studied problem. In all cases, our bounds are adaptive to the unknown variance, and empirically vastly outperform existing approaches based on Hoeffding or empirical Bernstein inequalities and their recent supermartingale generalizations. In short, we establish a new state of the art for four fundamental problems: CSs and CIs for bounded means, when sampling with and without replacement.
Ian Waudby-Smith, Aaditya Ramdas
Thu, 25 Aug 2022 00:00:00 GMT

A Maximum Entropy Copula Model for Mixed Data: Representation, Estimation, and Applications
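The betting idea behind the Waudby-Smith and Ramdas abstract can be sketched with a fixed-bet capital process. This is a toy version: the paper's strategies adapt the bet to the running variance, whereas the constant lam, the Beta-distributed data, and the alpha = 0.05 threshold below are illustrative assumptions.

```python
import numpy as np

def capital_process(xs, m, lam=0.2):
    """Two-sided betting e-process for H0: mean = m, with data in [0, 1].
    Averages an 'up' bet and a 'down' bet so that either direction of
    misspecification makes the capital grow; the bets are clipped so
    every multiplicative factor stays positive."""
    lam_up = min(lam, 0.5 / max(m, 1e-12))
    lam_dn = min(lam, 0.5 / max(1 - m, 1e-12))
    up = np.cumprod(1 + lam_up * (xs - m))
    dn = np.cumprod(1 - lam_dn * (xs - m))
    return 0.5 * (up + dn)

rng = np.random.default_rng(0)
xs = rng.beta(2, 5, size=2000)                  # true mean = 2/7
# Ville's inequality: under a true H0 the capital rarely exceeds 1/alpha,
# so crossing that line at any time rejects H0 at level alpha.
reject_false = capital_process(xs, 0.6).max() >= 20   # H0: mean = 0.6
```

A confidence sequence is then just the set of candidate means m whose capital process has never crossed 1/alpha.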
https://scholar.archive.org/work/icxswhhwffhybnvincp36jb5py
A new nonparametric model of the maximum-entropy (MaxEnt) copula density function is proposed, which offers the following advantages: (i) it is valid for mixed random vectors; by 'mixed' we mean the method works for any combination of discrete or continuous variables in a fully automated manner; (ii) it yields a bona fide density estimate with interpretable parameters; by 'bona fide' we mean the estimate is guaranteed to be a non-negative function that integrates to 1; and (iii) it plays a unifying role in our understanding of a large class of statistical methods. Our approach utilizes the modern machinery of nonparametric statistics to represent and approximate the log-copula density function via the LP-Fourier transform. Several real-data examples are also provided to explore the key theoretical and practical implications of the theory.
Subhadeep Mukhopadhyay
Mon, 22 Aug 2022 00:00:00 GMT

Efficiency of nonparametric e-tests
https://scholar.archive.org/work/owuak2wlwfcvrk7x3xp37wjipa
The notion of an e-value has been recently proposed as a possible alternative to critical regions and p-values in statistical hypothesis testing. In this note we introduce a simple analogue for e-values of Pitman's asymptotic relative efficiency and apply it to three popular nonparametric tests.
Vladimir Vovk, Ruodu Wang
Thu, 18 Aug 2022 00:00:00 GMT

Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme
https://scholar.archive.org/work/tyhisv7a7zdmpfxk4hpx5scs6y
Voice conversion is a common speech synthesis task which can be solved in different ways depending on the particular real-world scenario. The most challenging one, often referred to as one-shot many-to-many voice conversion, consists of copying the target voice from only one reference utterance in the most general case, when both source and target speakers do not belong to the training dataset. We present a scalable high-quality solution based on diffusion probabilistic modeling and demonstrate its superior quality compared to state-of-the-art one-shot voice conversion approaches. Moreover, focusing on real-time applications, we investigate general principles which can make diffusion models faster while keeping synthesis quality at a high level. As a result, we develop a novel stochastic differential equation solver suitable for various diffusion model types and generative tasks, as shown through empirical studies, and justify it by theoretical analysis.
Vadim Popov, Ivan Vovk, Vladimir Gogoryan, Tasnima Sadekova, Mikhail Kudinov, Jiansheng Wei
Thu, 04 Aug 2022 00:00:00 GMT

Conformal Risk Control
https://scholar.archive.org/work/dibytxezwrb6jpifewaashrljm
We extend conformal prediction to control the expected value of any monotone loss function. The algorithm generalizes split conformal prediction together with its coverage guarantee. Like conformal prediction, the conformal risk control procedure is tight up to an 𝒪(1/n) factor. Worked examples from computer vision and natural language processing demonstrate the usage of our algorithm to bound the false negative rate, graph distance, and token-level F1-score.
Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, Tal Schuster
Thu, 04 Aug 2022 00:00:00 GMT

Tight Concentrations and Confidence Sequences from the Regret of Universal Portfolio
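The conformal risk control procedure, pick the smallest threshold whose inflated empirical risk is at most alpha, can be sketched directly. The false-negative loss and the Dirichlet-generated "softmax" data below are illustrative stand-ins, not the paper's vision and language tasks.

```python
import numpy as np

def crc_threshold(losses, lambdas, alpha=0.1, B=1.0):
    """Conformal risk control (sketch): losses[i, j] is a monotone loss on
    calibration point i at lambdas[j], nonincreasing in lambda and bounded
    by B. Returns the smallest lambda whose inflated empirical risk
    (n/(n+1)) * mean_loss + B/(n+1) is at most alpha."""
    n = losses.shape[0]
    risk = (n / (n + 1)) * losses.mean(axis=0) + B / (n + 1)
    ok = np.where(risk <= alpha)[0]
    return lambdas[ok[0]] if len(ok) else lambdas[-1]

# Hypothetical multiclass setup: prediction set {y : p_y >= 1 - lam},
# loss = 1 if the true class is excluded (a false negative).
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(10), size=500)
labels = np.array([rng.choice(10, p=p) for p in probs])
lambdas = np.linspace(0, 1, 101)
p_true = probs[np.arange(500), labels]
losses = (p_true[:, None] < 1 - lambdas[None, :]).astype(float)
lam_hat = crc_threshold(losses, lambdas, alpha=0.1)
```

With the 0/1 coverage loss this reduces to ordinary split conformal; the point of the generalization is that any bounded monotone loss, such as FNR or graph distance, can replace it.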
https://scholar.archive.org/work/rxt225l2arge5fzgknmxtlaoce
A classic problem in statistics is the estimation of the expectation of random variables from samples. This gives rise to the tightly connected problems of deriving concentration inequalities and confidence sequences, that is, confidence intervals that hold uniformly over time. Previous work has shown how to easily convert the regret guarantee of an online betting algorithm into a time-uniform concentration inequality. In this paper, we show that we can go even further: the regret of universal portfolio algorithms gives rise to new implicit time-uniform concentrations and state-of-the-art empirically calculated confidence sequences. In particular, our numerically obtained confidence sequences can never be vacuous, even with a single sample, and satisfy the law of the iterated logarithm.
Francesco Orabona, Kwang-Sung Jun
Sun, 31 Jul 2022 00:00:00 GMT

Cuestiones Políticas, Volumen 40, Número 73, Julio-Diciembre de 2022
https://scholar.archive.org/work/udweeqbtvzeuvbjrvmrb3sd2rq
Universidad del Zulia
Jorge Villasmil
Fri, 29 Jul 2022 00:00:00 GMT

Conformal Prediction: a Unified Review of Theory and New Challenges
https://scholar.archive.org/work/nc74llid4nd2fjqldxd24ot54e
In this work we provide a review of basic ideas and novel developments in Conformal Prediction, an innovative distribution-free, non-parametric forecasting method based on minimal assumptions, which is able to yield, in a very straightforward way, prediction sets that are valid in a statistical sense even in the finite-sample case. The in-depth discussion provided in the paper covers the theoretical underpinnings of Conformal Prediction, and then proceeds to list the more advanced developments and adaptations of the original idea.
Matteo Fontana, Gianluca Zeni, Simone Vantini
Fri, 29 Jul 2022 00:00:00 GMT

Conformal Sensitivity Analysis for Individual Treatment Effects
https://scholar.archive.org/work/6h5zmsdzebgopb2eqy6v5i6knm
Estimating an individual treatment effect (ITE) is essential to personalized decision making. However, existing methods for estimating the ITE often rely on unconfoundedness, an assumption that is fundamentally untestable with observed data. To assess the robustness of individual-level causal conclusions to violations of unconfoundedness, this paper proposes a method for sensitivity analysis of the ITE: a way to estimate a range of the ITE under unobserved confounding. The method we develop quantifies unmeasured confounding through a marginal sensitivity model [Ros2002, Tan2006], and adapts the framework of conformal inference to estimate an ITE interval at a given confounding strength. In particular, we formulate this sensitivity analysis problem as a conformal inference problem under distribution shift, and we extend existing methods of covariate-shifted conformal inference to this more general setting. The result is a predictive interval with guaranteed nominal coverage of the ITE, providing distribution-free and nonasymptotic guarantees. We evaluate the method on synthetic data and illustrate its application in an observational study.
Mingzhang Yin, Claudia Shi, Yixin Wang, David M. Blei
Tue, 12 Jul 2022 00:00:00 GMT

Improving Trustworthiness of AI Disease Severity Rating in Medical Imaging with Ordinal Conformal Prediction Sets
https://scholar.archive.org/work/uzrqhxxm3rcs5jjp6jfkdrx4ka
The regulatory approval and broad clinical deployment of medical AI have been hampered by the perception that deep learning models fail in unpredictable and possibly catastrophic ways. A lack of statistically rigorous uncertainty quantification is a significant factor undermining trust in AI results. Recent developments in distribution-free uncertainty quantification present practical solutions for these issues by providing reliability guarantees for black-box models on arbitrary data distributions as formally valid finite-sample prediction intervals. Our work applies these new uncertainty quantification methods, specifically conformal prediction, to a deep-learning model for grading the severity of spinal stenosis in lumbar spine MRI. We demonstrate a technique for forming ordinal prediction sets that are guaranteed to contain the correct stenosis severity with a user-defined probability (confidence level). On a dataset of 409 MRI exams processed by the deep-learning model, the conformal method provides tight coverage with small prediction set sizes. Furthermore, we explore the potential clinical applicability of flagging cases with high-uncertainty predictions (large prediction sets) by quantifying an increase in the prevalence of significant imaging abnormalities (e.g. motion artifacts, metallic artifacts, and tumors) that could degrade confidence in predictive performance, when compared to a random sample of cases.
Charles Lu, Anastasios N. Angelopoulos, Stuart Pomerantz
Tue, 05 Jul 2022 00:00:00 GMT

Inductive Conformal Prediction: A Straightforward Introduction with Examples in Python
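A simplified version of ordinal prediction sets: score each calibration case by how many severity grades the model was off, then return the contiguous band of grades within the calibrated radius. The synthetic grades and errors below are placeholders, and the paper's construction works from softmax scores rather than this rank-distance score; this sketch only shows the contiguous-set idea.

```python
import numpy as np

def ordinal_conformal(cal_pred, cal_true, test_pred, n_grades, alpha=0.1):
    """Ordinal conformal set (sketch): nonconformity = |predicted grade -
    true grade| on calibration data; the prediction set is the contiguous
    band of grades within qhat of the point prediction."""
    scores = np.abs(cal_pred - cal_true)
    n = len(scores)
    qhat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n,
                       method="higher")
    lo = max(0, int(test_pred - qhat))
    hi = min(n_grades - 1, int(test_pred + qhat))
    return np.arange(lo, hi + 1)

# Hypothetical stenosis grades 0-3 over 409 exams; the model is off by
# at most one grade in this toy error model.
rng = np.random.default_rng(0)
true_grades = rng.integers(0, 4, size=409)
pred_grades = np.clip(true_grades + rng.integers(-1, 2, size=409), 0, 3)
band = ordinal_conformal(pred_grades, true_grades, test_pred=2, n_grades=4)
```

Cases whose band spans many grades are exactly the high-uncertainty exams the abstract proposes flagging for review.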
https://scholar.archive.org/work/2grnup2jybcrjp3qcyvt73nv44
Inductive Conformal Prediction (ICP) is a set of distribution-free and model-agnostic algorithms devised to predict at a user-defined confidence level with a coverage guarantee. Instead of making point predictions, i.e., a real number in the case of regression or a single class in multi-class classification, models calibrated using ICP output an interval or a set of classes, respectively. ICP takes on special importance in high-risk settings where we want the true output to belong to the prediction set with high probability. As an example, a classification model might output that, given a magnetic resonance image, a patient has no latent diseases to report. However, this output is based only on the most likely class; the second most likely class might indicate that the patient has a 15% chance of a brain tumor or another severe disease, in which case further exams should be conducted. Using ICP is therefore far more informative, and we believe it should be the standard way of producing forecasts. This paper is a hands-on introduction: we provide examples as we introduce the theory.
Martim Sousa
Fri, 01 Jul 2022 00:00:00 GMT

Machine Learning for Probabilistic Prediction
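The basic ICP recipe for classification, as in the Sousa introduction, fits in a few lines. The score 1 - p(true class) is one common choice of nonconformity measure, and the Dirichlet-generated "softmax" outputs below are hypothetical placeholders.

```python
import numpy as np

def icp_classification(cal_probs, cal_labels, test_prob, alpha=0.1):
    """Inductive conformal classification (sketch): nonconformity score
    1 - p(true class) on a held-out calibration set, then keep every
    class whose score is within the corrected (1 - alpha) quantile."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    qhat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n,
                       method="higher")
    return np.where(1.0 - test_prob <= qhat)[0]

rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(5), size=300)   # stand-in softmax outputs
cal_labels = np.array([rng.choice(5, p=p) for p in cal_probs])
pred_set = icp_classification(cal_probs, cal_labels, cal_probs[0])
```

A set that contains a serious diagnosis alongside the top class, as in the abstract's MRI example, is the signal that further exams are warranted.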
https://scholar.archive.org/work/cmab4kf7uvcnxhk64eb6h7jj2q
Prediction is the key objective of many machine learning applications. Accurate, reliable and robust predictions are essential for optimal and fair decisions by downstream components of artificial intelligence systems, especially in high-stakes applications such as personalised health, self-driving cars, finance, new drug development, and forecasting of election outcomes and pandemics. Many modern machine learning algorithms output overconfident predictions, resulting in incorrect decisions and technology acceptance issues. Classical calibration methods rely on artificial assumptions and often result in overfitting, whilst modern calibration methods attempt to solve calibration issues by modifying components of black-box deep learning systems. While this provides a partial solution, such modifications do not provide mathematical guarantees of prediction validity, and are intrusive, complex, and costly to implement. This thesis introduces novel methods for producing well-calibrated probabilistic predictions for machine learning classification and regression problems. A new method for multi-class classification problems is developed and compared to traditional calibration approaches. In the regression setting, the thesis develops novel methods for probabilistic regression to derive predictive distribution functions that are valid under a nonparametric IID assumption in terms of guaranteed coverage, and that contain more information than classical conformal prediction methods whilst improving computational efficiency. Experimental studies of the methods introduced in this thesis demonstrate advantages with regard to the state of the art. The main advantage of split conformal predictive systems is their guaranteed validity, whilst cross-conformal predictive systems enjoy higher predictive efficiency and empirical validity in the absence of excess randomisation.
Valery Manokhin, Vladimir Vovk, Alessio Sancetta
Fri, 24 Jun 2022 00:00:00 GMT

Merging sequential e-values via martingales
https://scholar.archive.org/work/h7mteearcjfq3ncd6b4pb7zmvq
We study the problem of merging sequential or independent e-values into one e-value for statistical decision making. We describe a class of e-value merging functions via martingales, and show that all merging methods for sequential e-values are dominated by this class. In the case of merging independent e-values, the situation becomes much more sophisticated, and we provide a general class of such merging functions based on reordered test martingales.
Vladimir Vovk, Ruodu Wang
Mon, 30 May 2022 00:00:00 GMT

Efficient and Differentiable Conformal Prediction with General Function Classes
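The simplest member of the martingale class from the Vovk and Wang abstract is the running product: partial products of sequential e-values form a test martingale, so the final product is again an e-value. The numbers below are arbitrary illustrative e-values.

```python
import numpy as np

def merge_sequential_evalues(es):
    """Merge sequential e-values by taking their product: the partial
    products form a test martingale under the null, so the final product
    is itself an e-value (the class shown to dominate all merging rules
    for sequential e-values)."""
    return float(np.prod(es))

# Evidence compounds multiplicatively; values below 1 discount it.
es = [1.5, 0.8, 2.0, 1.2]
e_merged = merge_sequential_evalues(es)    # approximately 2.88
```

For merely independent (non-sequential) e-values this dominance no longer holds, which is where the paper's reordered test martingales come in.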
https://scholar.archive.org/work/23jtj5mqavb4la3osat2tmwolq
Quantifying the data uncertainty in learning tasks is often done by learning a prediction interval or prediction set of the label given the input. Two commonly desired properties for learned prediction sets are valid coverage and good efficiency (such as low length or low cardinality). Conformal prediction is a powerful technique for learning prediction sets with valid coverage, yet by default its conformalization step only learns a single parameter, and does not optimize the efficiency over more expressive function classes. In this paper, we propose a generalization of conformal prediction to multiple learnable parameters, by considering the constrained empirical risk minimization (ERM) problem of finding the most efficient prediction set subject to valid empirical coverage. This meta-algorithm generalizes existing conformal prediction algorithms, and we show that it achieves approximate valid population coverage and near-optimal efficiency within class, whenever the function class in the conformalization step is low-capacity in a certain sense. Next, this ERM problem is challenging to optimize as it involves a non-differentiable coverage constraint. We develop a gradient-based algorithm for it by approximating the original constrained ERM using differentiable surrogate losses and Lagrangians. Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.
Yu Bai, Song Mei, Huan Wang, Yingbo Zhou, Caiming Xiong
Sun, 29 May 2022 00:00:00 GMT

Distribution-free inference for regression: discrete, continuous, and in between
https://scholar.archive.org/work/34sirkw4nnfuze6f2ftdtl2cby
In data analysis problems where we are not able to rely on distributional assumptions, what types of inference guarantees can still be obtained? Many popular methods, such as holdout methods, cross-validation methods, and conformal prediction, are able to provide distribution-free guarantees for predictive inference, but the problem of providing inference for the underlying regression function (for example, inference on the conditional mean 𝔼[Y|X]) is more challenging. In the setting where the features X are continuously distributed, recent work has established that any confidence interval for 𝔼[Y|X] must have non-vanishing width, even as sample size tends to infinity. At the other extreme, if X takes only a small number of possible values, then inference on 𝔼[Y|X] is trivial to achieve. In this work, we study the problem in settings in between these two extremes. We find that there are several distinct regimes in between the finite setting and the continuous setting, where vanishing-width confidence intervals are achievable if and only if the effective support size of the distribution of X is smaller than the square of the sample size.
Yonghoon Lee, Rina Foygel Barber
Sat, 28 May 2022 00:00:00 GMT