IA Scholar Query: Power and limitations of conformal martingales.
https://scholar.archive.org/
Internet Archive Scholar query results feed | en | info@archive.org | Mon, 03 Oct 2022 00:00:00 GMT | fatcat-scholar | https://scholar.archive.org/help | 1440

Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning
https://scholar.archive.org/work/bj4g5l23aja7nd7un4vmqfm7hy
A key assumption in most existing works on FL algorithms' convergence analysis is that the noise in stochastic first-order information has a finite variance. Although this assumption covers all light-tailed (i.e., sub-exponential) and some heavy-tailed noise distributions (e.g., log-normal, Weibull, and some Pareto distributions), it fails for many fat-tailed noise distributions (i.e., "heavier-tailed" with potentially infinite variance) that have been empirically observed in the FL literature. To date, it remains unclear whether one can design convergent algorithms for FL systems that experience fat-tailed noise. This motivates us to fill this gap in this paper by proposing an algorithmic framework called FAT-Clipping (federated averaging with two-sided learning rates and clipping), which contains two variants: FAT-Clipping per-round (FAT-Clipping-PR) and FAT-Clipping per-iteration (FAT-Clipping-PI). Specifically, for the largest α ∈ (1,2] such that the fat-tailed noise in FL still has a bounded α-moment, we show that both variants achieve 𝒪((mT)^((2-α)/α)) and 𝒪((mT)^((1-α)/(3α-2))) convergence rates in the strongly-convex and general non-convex settings, respectively, where m and T are the numbers of clients and communication rounds. Moreover, at the expense of more clipping operations compared to FAT-Clipping-PR, FAT-Clipping-PI further enjoys a linear speedup effect with respect to the number of local updates at each client and is lower-bound-matching (i.e., order-optimal). Collectively, our results advance the understanding of designing efficient algorithms for FL systems that exhibit fat-tailed first-order oracle information.
Haibo Yang, Peiwen Qiu, Jia Liu
work_bj4g5l23aja7nd7un4vmqfm7hy
Mon, 03 Oct 2022 00:00:00 GMT

Improved Algorithms for Neural Active Learning
https://scholar.archive.org/work/wlapgk7z65cbtnijnikvgaxcoa
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting. In particular, we introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work. Then, the proposed algorithm leverages the powerful representation of NNs for both exploitation and exploration, has the query decision-maker tailored for k-class classification problems with the performance guarantee, utilizes the full feedback, and updates parameters in a more practical and efficient manner. These careful designs lead to a better regret upper bound, improving by a multiplicative factor O(log T) and removing the curse of both input dimensionality and the complexity of the function to be learned. Furthermore, we show that the algorithm can achieve the same performance as the Bayes-optimal classifier in the long run under the hard-margin setting in classification problems. Finally, we use extensive experiments to evaluate the proposed algorithm and SOTA baselines, to show the improved empirical performance.
Yikun Ban, Yuheng Zhang, Hanghang Tong, Arindam Banerjee, Jingrui He
work_wlapgk7z65cbtnijnikvgaxcoa
Sun, 02 Oct 2022 00:00:00 GMT

The fuzzy Potts model in the plane: Scaling limits and arm exponents
https://scholar.archive.org/work/raboh4pju5cwzkk2fsmfgb5zqy
We study the fuzzy Potts model on a critical FK percolation in the plane, which is obtained by coloring the clusters of the percolation model independently at random. We show that under the assumption that this critical FK percolation model converges to a conformally invariant scaling limit (which is known to hold for the FK-Ising model), the obtained coloring converges to variants of Conformal Loop Ensembles constructed, described and studied by Miller, Sheffield and Werner. We also show, using discrete considerations, that the arm exponents for this coloring in the discrete model are identical to the ones of the continuum model. Using the values of these arm exponents in the continuum, we determine the arm exponents for the fuzzy Potts model.
Laurin Köhler-Schindler, Matthis Lehmkuehler
work_raboh4pju5cwzkk2fsmfgb5zqy
Mon, 26 Sep 2022 00:00:00 GMT

On Multivariate Time-Varying Dynamic Models
https://scholar.archive.org/work/smnhrcsmprbsxfip26fsgneada
This dissertation consists of three chapters that contribute to different multivariate time series models with local stationarity; that is, the underlying data generating mechanism of the dynamic process is changing smoothly over time. The first chapter briefly reviews the literature. The second chapter considers a new class of time-varying vector moving average infinity processes. The third chapter introduces a new class of time-varying vector autoregression (VAR) models in which the VAR coefficients and covariance matrix of the error innovations are allowed to change smoothly over time. The fourth chapter considers a wide class of time-varying multivariate causal processes that nests many classic and new examples as special cases. Numerical studies are conducted to illustrate the usefulness of the proposed models and methods.
YAYI YAN
work_smnhrcsmprbsxfip26fsgneada
Mon, 26 Sep 2022 00:00:00 GMT

Proximal Point Imitation Learning
https://scholar.archive.org/work/nkf7uknumrhjberzwfptwas64i
This work develops new algorithms with rigorous efficiency guarantees for infinite horizon imitation learning (IL) with linear function approximation without restrictive coherence assumptions. We begin with the minimax formulation of the problem and then outline how to leverage classical tools from optimization, in particular, the proximal-point method (PPM) and dual smoothing, for online and offline IL, respectively. Thanks to PPM, we avoid nested policy evaluation and cost updates for online IL appearing in the prior literature. In particular, we do away with the conventional alternating updates by the optimization of a single convex and smooth objective over both cost and Q-functions. When solved inexactly, we relate the optimization errors to the suboptimality of the recovered policy. As an added bonus, by re-interpreting PPM as dual smoothing with the expert policy as a center point, we also obtain an offline IL algorithm enjoying theoretical guarantees in terms of required expert trajectories. Finally, we achieve convincing empirical performance for both linear and neural network function approximation.
Luca Viano, Angeliki Kamoutsi, Gergely Neu, Igor Krawczuk, Volkan Cevher
work_nkf7uknumrhjberzwfptwas64i
Thu, 22 Sep 2022 00:00:00 GMT

Conformal removability of SLE_4
https://scholar.archive.org/work/lsu5z5mw7rakpmq2kf6qa62srq
We consider the Schramm-Loewner evolution (SLE_κ) with κ=4, the critical value of κ > 0 at or below which SLE_κ is a simple curve and above which it is self-intersecting. We show that the range of an SLE_4 curve is a.s. conformally removable, answering a question posed by Sheffield. Such curves arise as the conformal welding of a pair of independent critical (γ=2) Liouville quantum gravity (LQG) surfaces along their boundaries and our result implies that this conformal welding is unique. In order to establish this result, we give a new sufficient condition for a set X ⊆ 𝐂 to be conformally removable which applies in the case that X is not necessarily the boundary of a simply connected domain.
Konstantinos Kavvadias, Jason Miller, Lukas Schoug
work_lsu5z5mw7rakpmq2kf6qa62srq
Wed, 21 Sep 2022 00:00:00 GMT

Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning
https://scholar.archive.org/work/u3453oazkjhxljmer25zk2iodi
The age of information metric fails to correctly describe the intrinsic semantics of a status update. In an intelligent reflecting surface-aided cooperative relay communication system, we propose the age of semantics (AoS) for measuring semantics freshness of the status updates. Specifically, we focus on the status updating from a source node (SN) to the destination, which is formulated as a Markov decision process (MDP). The objective of the SN is to maximize the expected satisfaction of AoS and energy consumption under the maximum transmit power constraint. To seek the optimal control policy, we first derive an online deep actor-critic (DAC) learning scheme under the on-policy temporal difference learning framework. However, implementing the online DAC in practice poses a key challenge: it requires infinitely repeated interactions between the SN and the system, which can be dangerous, particularly during exploration. We then put forward a novel offline DAC scheme, which estimates the optimal control policy from a previously collected dataset without any further interactions with the system. Numerical experiments verify the theoretical results and show that our offline DAC scheme significantly outperforms the online DAC scheme and the most representative baselines in terms of mean utility, demonstrating strong robustness to dataset quality.
Xianfu Chen, Zhifeng Zhao, Shiwen Mao, Celimuge Wu, Honggang Zhang, Mehdi Bennis
work_u3453oazkjhxljmer25zk2iodi
Mon, 19 Sep 2022 00:00:00 GMT

Liouville quantum gravity from random matrix dynamics
https://scholar.archive.org/work/pdh6oy6y4nh5zjpdnevsa26e4a
We establish the first connection between 2d Liouville quantum gravity and natural dynamics of random matrices. In particular, we show that if (U_t) is a Brownian motion on the unitary group at equilibrium, then the measures |det(U_t - e^{iθ})|^γ dt dθ converge in the limit of large dimension to the 2d LQG measure, a properly normalized exponential of the 2d Gaussian free field. Gaussian free field type fluctuations associated with these dynamics were first established by Spohn (1998) and convergence to the LQG measure in 2d settings was conjectured since the work of Webb (2014), who proved the convergence of related one dimensional measures by using inputs from Riemann-Hilbert theory. The convergence follows from the first multi-time extension of the result by Widom (1973) on Fisher-Hartwig asymptotics of Toeplitz determinants with real symbols. To prove these, we develop a general surgery argument and combine determinantal point processes estimates with stochastic analysis on Lie groups, providing in passing a probabilistic proof of Webb's 1d result. We believe the techniques will be more broadly applicable to matrix dynamics out of equilibrium, joint moments of determinants for classes of correlated random matrices, and the characteristic polynomial of non-Hermitian random matrices.
Paul Bourgade, Hugo Falconet
work_pdh6oy6y4nh5zjpdnevsa26e4a
Sat, 17 Sep 2022 00:00:00 GMT

Multiplicative chaos measures from thick points of log-correlated fields
https://scholar.archive.org/work/a5jozmi7ozblfjvtwy7lkdcydq
We prove that multiplicative chaos measures can be constructed from extreme level sets or thick points of the underlying logarithmically correlated field. We develop a method which covers the whole subcritical phase and only requires asymptotics of suitable exponential moments for the field. As an application, we establish that these estimates hold for the logarithm of the absolute value of the characteristic polynomial of a Haar distributed random unitary matrix (CUE), using known asymptotics for Toeplitz determinants with (merging) Fisher-Hartwig singularities. Hence, this proves a conjecture of Fyodorov and Keating concerning the fluctuations of the volume of thick points of the CUE characteristic polynomial.
Janne Junnila, Gaultier Lambert, Christian Webb
work_a5jozmi7ozblfjvtwy7lkdcydq
Wed, 14 Sep 2022 00:00:00 GMT

Causal Bandits for Linear Structural Equation Models
https://scholar.archive.org/work/epbi4pqkzfayzdyy7k6jjxlpma
This paper studies the problem of designing an optimal sequence of interventions in a causal graphical model to minimize the cumulative regret with respect to the best intervention in hindsight. This is, naturally, posed as a causal bandit problem. The focus is on causal bandits for linear structural equation models (SEMs) and soft interventions. It is assumed that the graph's structure is known, and it has N nodes. Two linear mechanisms, one soft intervention and one observational, are assumed for each node, giving rise to 2^N possible interventions. The existing causal bandit algorithms assume that at least the interventional distributions of the reward node's parents are fully specified. However, there are 2^N such distributions (one corresponding to each intervention), acquiring which becomes prohibitive even in moderate-sized graphs. This paper dispenses with the assumption of knowing these distributions. Two algorithms are proposed for the frequentist (UCB-based) and Bayesian (Thompson Sampling-based) settings. The key idea of these algorithms is to avoid directly estimating the 2^N reward distributions and instead estimate the parameters that fully specify the SEMs (linear in N) and use them to compute the rewards. In both algorithms, under boundedness assumptions on noise and the parameter space, the cumulative regrets scale as Õ((2d)^L L √T), where d is the graph's maximum degree, and L is the length of its longest causal path.
Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer
work_epbi4pqkzfayzdyy7k6jjxlpma
Mon, 05 Sep 2022 00:00:00 GMT

Mating of trees for critical Liouville quantum gravity
https://scholar.archive.org/work/bttltrjwffbstmn5r4udsecdfi
In a groundbreaking work, Duplantier, Miller and Sheffield showed that subcritical Liouville quantum gravity (LQG) coupled with Schramm-Loewner evolutions (SLE) can be described by the mating of two continuum random trees. In this paper, we consider the counterpart of their result for critical LQG and SLE, i.e., for the case when γ^2=κ=16/κ=4. We prove that as one sends κ ↓ 4 in the subcritical setting, the space-filling SLE_κ in a disk degenerates to the CLE_4 exploration introduced by Werner and Wu, along with a collection of i.i.d. coin tosses indexed by the branch points of the exploration. Furthermore, in the κ=16/γ^2 ↓ 4 limit, the pair of continuum random trees collapse into a single continuum random tree, and we observe that upon applying an appropriate affine transform to the encoding Brownian motions before taking the limit, we get convergence to a pair of independent Brownian motions (A,B). The Brownian motion A encodes the LQG distance from the CLE loops to the boundary of the disk, while the Brownian motion B encodes the boundary lengths of the CLE_4 loops. In contrast to the subcritical setting, (A,B) does not determine the CLE-decorated LQG surface.
Juhan Aru, Nina Holden, Ellen Powell, Xin Sun
work_bttltrjwffbstmn5r4udsecdfi
Wed, 31 Aug 2022 00:00:00 GMT

A Consistent and Robust Test for Autocorrelated Jump Occurrences
https://scholar.archive.org/work/sn3r47mxqfhqtokfiaehjuqofe
We develop a nonparametric test for the temporal dependence of jump occurrences in the population. The test is consistent against all pairwise serial dependence, and is robust to the jump activity level and the choice of sampling scheme. We establish asymptotic normality and local power property for a rich set of local alternatives, including self-exciting and/or self-inhibitory jumps. A simulation study confirms the robustness of the test and reveals its competitive size and power performance over existing tests. In an empirical study on high-frequency stock returns, our procedure uncovers a wide array of autocorrelation profiles of jump occurrences for different stocks in different time periods.
Simon Kwok
work_sn3r47mxqfhqtokfiaehjuqofe
Mon, 29 Aug 2022 00:00:00 GMT

On the theory and practice of tensor recovery for high-dimensional partial differential equations
https://scholar.archive.org/work/6zfk5mi7ondg3kdmhbr23f4e3m
This thesis considers the problem of approximating low-rank tensors from data and its use for the non-intrusive solution of high-dimensional parametric partial differential equations (PDEs) and stochastic differential equations (SDEs). High-dimensional here refers to the large number of variables on which the solution depends. The looming curse of dimensionality, i.e. the exponential scaling of the number of parameters with respect to the number of variables, which is inherent in all generic, linear approximations, is evaded by applying hierarchical tensor formats, in particular tensor-trains, to represent the sought functions. As a non-intrusive method to attain such representations, regression is considered and the required high-dimensional integrals in the error functional are estimated by (quasi) Monte Carlo methods. The first part of this thesis analyzes the convergence of this empirical best approximation method and introduces a novel algorithm to find surprisingly good approximations even when the number of samples is low. The second part of this thesis considers the application of hierarchical tensor formats to practical problems and demonstrates the effectiveness of this approach on selected examples.
Philipp Trunschke, Technische Universität Berlin, Reinhold Schneider
work_6zfk5mi7ondg3kdmhbr23f4e3m
Mon, 29 Aug 2022 00:00:00 GMT

Information Design in Concave Games
https://scholar.archive.org/work/mmvtq65isjgnnnulqm5w3hmkve
We study information design in games with a continuum of actions such that the players' payoffs are concave in their own actions. A designer chooses an information structure -- a joint distribution of a state and a private signal of each player -- and evaluates it according to the designer's expected payoff under the equilibrium play in the induced Bayesian game. We show that an information structure is designer optimal whenever it induces the equilibrium play that can be implemented by an incentive contract in an auxiliary principal-agent problem with a single agent who observes the state and controls all actions. We use this result to characterize optimal information structures in a variety of settings, including price competition, first-order Bayesian persuasion, and venture capital fundraising. If the state is normally distributed and the payoffs are quadratic, then in many cases Gaussian information structures are optimal. Fully informing a subset of players can also be optimal and robustly so, for all state distributions.
Alex Smolin, Takuro Yamashita
work_mmvtq65isjgnnnulqm5w3hmkve
Sat, 20 Aug 2022 00:00:00 GMT

Ballistic macroscopic fluctuation theory
https://scholar.archive.org/work/j3crk4antvcnraf4ogoxhhgmp4
We introduce a new universal framework describing fluctuations and correlations in quantum and classical many-body systems, at the Euler hydrodynamic scale of space and time. The framework adapts the ideas of the conventional macroscopic fluctuation theory (MFT) to systems that support ballistic transport. The resulting "ballistic MFT" (BMFT) is solely based on the Euler hydrodynamics data of the many-body system. Within this framework, mesoscopic observables are classical random variables depending only on the fluctuating conserved densities, and Euler-scale fluctuations are obtained by deterministically transporting thermodynamic fluctuations via the Euler hydrodynamics. Using the BMFT, we show that long-range correlations in space generically develop over time from long-wavelength inhomogeneous initial states in interacting models. This result, which we verify by numerical calculations, challenges the long-held paradigm that at the Euler scale, fluid cells may be considered uncorrelated. We also show that the Gallavotti-Cohen fluctuation theorem for non-equilibrium ballistic transport follows purely from time-reversal invariance of the Euler hydrodynamics. We check the validity of the BMFT by applying it to integrable systems, and in particular the hard-rod gas, with extensive simulations that confirm our analytical results.
Benjamin Doyon, Gabriele Perfetto, Tomohiro Sasamoto, Takato Yoshimura
work_j3crk4antvcnraf4ogoxhhgmp4
Sat, 20 Aug 2022 00:00:00 GMT

Riemannian Diffusion Models
https://scholar.archive.org/work/b45kad7svzfzfkpuyq7cwbranm
Diffusion models are recent state-of-the-art methods for image generation and likelihood estimation. In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation. Computationally, we propose new methods for computing the Riemannian divergence which is needed in the likelihood estimation. Moreover, in generalizing the Euclidean case, we prove that maximizing this variational lower-bound is equivalent to Riemannian score matching. Empirically, we demonstrate the expressive power of Riemannian diffusion models on a wide spectrum of smooth manifolds, such as spheres, tori, hyperboloids, and orthogonal groups. Our proposed method achieves new state-of-the-art likelihoods on all benchmarks.
Chin-Wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, Aaron Courville
work_b45kad7svzfzfkpuyq7cwbranm
Tue, 16 Aug 2022 00:00:00 GMT

Machine learning meets false discovery rate
https://scholar.archive.org/work/rahb2kjf7vcvxdoksvuoig72mu
Classical false discovery rate (FDR) controlling procedures offer strong and interpretable guarantees, but they often lack flexibility. On the other hand, recent machine learning classification algorithms, such as those based on random forests (RF) or neural networks (NN), perform very well in practice but lack interpretability and theoretical guarantees. In this paper, we make these two meet by introducing a new adaptive novelty detection procedure with FDR control, called AdaDetect. It extends the scope of recent work in the multiple testing literature to the high dimensional setting, notably that of Yang et al. (2021). AdaDetect is shown both to strongly control the FDR and to have a power that mimics that of the oracle in a specific sense. The interest and validity of our approach are demonstrated with theoretical results, numerical experiments on several benchmark datasets, and an application to astrophysical data. In particular, while AdaDetect can be used in combination with any classifier, it is particularly efficient on real-world datasets with RF, and on images with NN.
Ariane Marandon, Lihua Lei, David Mary, Etienne Roquain
work_rahb2kjf7vcvxdoksvuoig72mu
Sat, 13 Aug 2022 00:00:00 GMT

Point Processes and Multiple SLE/GFF Coupling
https://scholar.archive.org/work/acezametwrhcjaete77gjoi6gy
In this series of lectures, we will discuss probability laws of random points, curves, and surfaces. Starting from a brief review of the notion of martingales, one-dimensional Brownian motion (BM), and the D-dimensional Bessel processes, BES_D, D ≥ 1, first we study Dyson's Brownian motion model with parameter β > 0, DYS_β, which is regarded as a multivariate extension of BES_D with the relation β=D-1. Next, using the reproducing kernels of Hilbert function spaces, the Gaussian analytic functions (GAFs) are defined on a unit disk and an annulus. As zeros of the GAFs, determinantal point processes and permanental-determinantal point processes are obtained. Then, the Schramm–Loewner evolution with parameter κ > 0, SLE_κ, is introduced, which is driven by a BM on ℝ and generates a family of conformally invariant probability laws of random curves on the upper half complex plane ℍ. We regard SLE_κ as a complexification of BES_D with the relation κ=4/(D-1). The last topic of the lectures is the construction of the multiple SLE_κ, which is driven by the N-particle process on ℝ and generates N interacting random curves in ℍ. We prove that the multiple SLE/GFF coupling is established if and only if the driving N-particle process on ℝ is identified with DYS_β with the relation β=8/κ.
Makoto Katori
work_acezametwrhcjaete77gjoi6gy
Fri, 12 Aug 2022 00:00:00 GMT

Isotonic Distributional Regression
https://scholar.archive.org/work/5m2oaz7yojaghn6tlsrnsuq4hq
Distributional regression estimates the probability distribution of a response variable conditional on covariates. The estimated conditional distribution comprehensively summarizes the available information on the response variable, and allows one to derive all statistical quantities of interest, such as the conditional mean, threshold exceedance probabilities, or quantiles. This thesis develops isotonic distributional regression, a method for estimating conditional distributions under the assumption of a monotone relationship between covariates and a response variable. The response variable is univariate and real-valued, and the covariates lie in a partially ordered set. The monotone relationship is formulated in terms of stochastic order constraints, that is, the response variable increases in a stochastic sense as the covariates increase in the partial order. This assumption alone yields a shape-constrained non-parametric estimator, which does not involve any tuning parameters. The estimation of distributions under stochastic order restrictions has already been studied for various stochastic orders, but so far only with totally ordered covariates. Apart from considering more general partially ordered covariates, the first main contribution of this thesis lies in a shift of focus from estimation to prediction. Distributional regression is the backbone of probabilistic forecasting, which aims at quantifying the uncertainty about a future quantity of interest comprehensively in the form of probability distributions. When analyzed with respect to predominant criteria for probabilistic forecast quality, isotonic distributional regression is shown to have desirable properties. In addition, this thesis develops an efficient algorithm for the computation of isotonic distributional regression, and proposes an estimator under a weaker, previously not thoroughly studied stochastic order constraint.
A main application of isotonic distributional regression is the uncertainty quantification for point forecasts. Such po [...]
Alexander Henzi
work_5m2oaz7yojaghn6tlsrnsuq4hq
Fri, 12 Aug 2022 00:00:00 GMT

Predictive Quantile Regression with Mixed Roots and Increasing Dimensions: ALQR Approach
https://scholar.archive.org/work/6ainid3jbrdz3hcdl5iiuxyl6m
In this paper we propose the adaptive lasso for predictive quantile regression (ALQR). Reflecting empirical findings, we allow predictors to have various degrees of persistence and exhibit different signal strengths. The number of predictors is allowed to grow with the sample size. We study regularity conditions under which stationary, local unit root, and cointegrated predictors are present simultaneously. We next show the convergence rates, model selection consistency, and asymptotic distributions of ALQR. We apply the proposed method to the out-of-sample quantile prediction problem of stock returns and find that it outperforms the existing alternatives. We also provide numerical evidence from additional Monte Carlo experiments, supporting the theoretical results.
Rui Fan, Ji Hyung Lee, Youngki Shin
work_6ainid3jbrdz3hcdl5iiuxyl6m
Wed, 10 Aug 2022 00:00:00 GMT
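
As a rough illustration of the kind of objective an adaptive lasso for quantile regression minimizes, the sketch below combines the standard quantile (pinball) loss with a weighted L1 penalty. It is a generic textbook form, not the ALQR paper's estimator: the function names (pinball_loss, adaptive_weights, alqr_objective), the weight rule 1/|b|^gamma, and the parameters lam, gamma and eps are all assumptions made for illustration, and nothing here handles the persistent or cointegrated predictors the abstract discusses.

```python
def pinball_loss(residuals, tau):
    """Average quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    total = 0.0
    for u in residuals:
        total += u * (tau - (1.0 if u < 0 else 0.0))
    return total / len(residuals)


def adaptive_weights(beta_init, gamma=1.0, eps=1e-8):
    """Hypothetical adaptive weights 1 / |b|^gamma from a first-stage estimate.

    Coefficients that look small in the first stage get large weights and are
    therefore penalized more heavily; eps guards against division by zero.
    """
    return [1.0 / (abs(b) + eps) ** gamma for b in beta_init]


def alqr_objective(beta, X, y, tau, lam, weights):
    """Pinball loss of the linear fit plus a weighted L1 penalty on beta."""
    residuals = [
        y_i - sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        for x_i, y_i in zip(X, y)
    ]
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return pinball_loss(residuals, tau) + penalty
```

In practice such an objective would be handed to a linear-programming or coordinate-descent solver; the sketch only evaluates it, which is enough to compare candidate coefficient vectors at a chosen quantile level tau.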