585 Hits in 1.4 sec

The FDR-Linking Theorem [article]

Weijie J. Su
<span title="2018-12-21">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
This paper introduces the FDR-linking theorem, a novel technique for understanding non-asymptotic FDR control of the Benjamini–Hochberg (BH) procedure under arbitrary dependence of the p-values. This theorem offers a principled and flexible approach to linking all p-values and the null p-values from the FDR control perspective, suggesting a profound implication that, to a large extent, the FDR of the BH procedure relies mostly on the null p-values. To illustrate the use of this theorem, we propose a new type of dependence concerning only the null p-values which, while strictly relaxing the state-of-the-art PRDS dependence (Benjamini and Yekutieli, 2001), ensures that the FDR of the BH procedure stays below a level that is independent of the number of hypotheses. This level is, furthermore, shown to be optimal under this new dependence structure. Next, we present a concept referred to as FDR consistency that is weaker but more amenable than FDR control, and the FDR-linking theorem shows that FDR consistency is completely determined by the joint distribution of the null p-values, thereby reducing the analysis of this new concept to the global null case. Finally, this theorem is used to obtain a sharp FDR bound under arbitrary dependence, which improves the log-correction FDR bound (Benjamini and Yekutieli, 2001) in certain regimes.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1812.08965v1">arXiv:1812.08965v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xkyuemxlszelbmd6zfetiv45v4">fatcat:xkyuemxlszelbmd6zfetiv45v4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200915220727/https://arxiv.org/pdf/1812.08965v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/51/8a/518aff9ecab02fb911cd09eef52a43a22886761e.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1812.08965v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Federated f-Differential Privacy [article]

Qinqing Zheng, Shuxiao Chen, Qi Long, Weijie J. Su
<span title="2021-02-22">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Federated learning (FL) is a training paradigm in which clients collaboratively learn models by repeatedly sharing information without compromising much on the privacy of their local sensitive data. In this paper, we introduce federated f-differential privacy, a new notion specifically tailored to the federated setting, based on the framework of Gaussian differential privacy. Federated f-differential privacy operates at the record level: it provides a privacy guarantee for each individual record of one client's data against adversaries. We then propose a generic private federated learning framework, PriFedSync, that accommodates a large family of state-of-the-art FL algorithms and provably achieves federated f-differential privacy. Finally, we empirically demonstrate the trade-off between privacy guarantees and prediction performance for models trained by PriFedSync on computer vision tasks.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2102.11158v1">arXiv:2102.11158v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/zbnhfqulpjck7bywdko4ntkvni">fatcat:zbnhfqulpjck7bywdko4ntkvni</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210224234308/https://arxiv.org/pdf/2102.11158v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/39/9b/399bad0f5cd6252c6195fc1a1f035d02d47baff8.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2102.11158v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Private False Discovery Rate Control [article]

Cynthia Dwork, Weijie Su, Li Zhang
<span title="2015-11-12">2015</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
We provide the first differentially private algorithms for controlling the false discovery rate (FDR) in multiple hypothesis testing, with essentially no loss in power under certain conditions. Our general approach is to adapt a well-known variant of the Benjamini–Hochberg procedure (BHq), making each step differentially private. This destroys the classical proof of FDR control. To prove FDR control of our method, (a) we develop a new proof of the original (non-private) BHq algorithm and its robust variants, a proof requiring only the assumption that the true null test statistics are independent, while allowing arbitrary correlations between the true nulls and false nulls. This assumption is fairly weak compared to those previously used in the vast literature on this topic, and explains in part the empirical robustness of BHq. Then (b) we relate the FDR control properties of the differentially private version to the control properties of the non-private version. We also present a low-distortion "one-shot" differentially private primitive for "top k" problems, e.g., "Which are the k most popular hobbies?" (which we apply to "Which hypotheses have the k most significant p-values?"), and use it to obtain a faster privacy-preserving instantiation of our general approach at little cost in accuracy. The proof of privacy for the one-shot top-k algorithm introduces a new technique of independent interest.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1511.03803v1">arXiv:1511.03803v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/fjn53cf7vzeu3m5nhsinybkghu">fatcat:fjn53cf7vzeu3m5nhsinybkghu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200904123437/https://arxiv.org/pdf/1511.03803v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/2e/14/2e14d8ced413b04052cc756a8a65b5e4f4db02f3.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1511.03803v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

On Learning Rates and Schrödinger Operators [article]

Bin Shi, Weijie J. Su, Michael I. Jordan
<span title="2020-04-15">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
The learning rate is perhaps the single most important parameter in the training of neural networks and, more broadly, in stochastic (nonconvex) optimization. Accordingly, there are numerous effective, but poorly understood, techniques for tuning the learning rate, including learning rate decay, which starts with a large initial learning rate that is gradually decreased. In this paper, we present a general theoretical analysis of the effect of the learning rate in stochastic gradient descent (SGD). Our analysis is based on the use of a learning-rate-dependent stochastic differential equation (lr-dependent SDE) that serves as a surrogate for SGD. For a broad class of objective functions, we establish a linear rate of convergence for this continuous-time formulation of SGD, highlighting the fundamental importance of the learning rate in SGD, in contrast to gradient descent and stochastic gradient Langevin dynamics. Moreover, we obtain an explicit expression for the optimal linear rate by analyzing the spectrum of the Witten-Laplacian, a special case of the Schrödinger operator associated with the lr-dependent SDE. Strikingly, this expression clearly reveals the dependence of the linear convergence rate on the learning rate: the linear rate decreases rapidly to zero as the learning rate tends to zero for a broad class of nonconvex functions, whereas it stays constant for strongly convex functions. Based on this sharp distinction between nonconvex and convex problems, we provide a mathematical interpretation of the benefits of using learning rate decay for nonconvex optimization.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2004.06977v1">arXiv:2004.06977v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/pervsnx4cvfcxcbe3dai6tmsfm">fatcat:pervsnx4cvfcxcbe3dai6tmsfm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200417222056/https://arxiv.org/pdf/2004.06977v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2004.06977v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Rejoinder: Gaussian Differential Privacy [article]

Jinshuo Dong, Aaron Roth, Weijie J. Su
<span title="2021-06-26">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
In this rejoinder, we aim to address two broad issues that cover most comments made in the discussion. First, we discuss some theoretical aspects of our work and comment on how this work might impact the theoretical foundation of privacy-preserving data analysis. Taking a practical viewpoint, we next discuss how f-differential privacy (f-DP) and Gaussian differential privacy (GDP) can make a difference in a range of applications.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2104.01987v2">arXiv:2104.01987v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qrwqsrtklnbsvcriibay3egrau">fatcat:qrwqsrtklnbsvcriibay3egrau</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210701021635/https://arxiv.org/pdf/2104.01987v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/10/59/1059c558288a2ad0ada8d79917ae54f52cf096d5.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2104.01987v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Robust Inference Under Heteroskedasticity via the Hadamard Estimator [article]

Edgar Dobriban, Weijie J. Su
<span title="2018-07-01">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Drawing statistical inferences from large datasets in a model-robust way is an important problem in statistics and data science. In this paper, we propose methods that are robust to large and unequal noise in different observational units (i.e., heteroskedasticity) for statistical inference in linear regression. We leverage the Hadamard estimator, which is unbiased for the variances of ordinary least-squares regression. This is in contrast to the popular White's sandwich estimator, which can be substantially biased in high dimensions. We propose to estimate the signal strength, noise level, signal-to-noise ratio, and mean squared error via the Hadamard estimator. We develop a new degrees-of-freedom adjustment that gives more accurate confidence intervals than variants of White's sandwich estimator. Moreover, we provide conditions ensuring the estimator is well-defined, by studying a new random matrix ensemble in which the entries of a random orthogonal projection matrix are squared. We also show approximate normality using the second-order Poincaré inequality. Our work provides improved statistical theory and methods for linear regression in high dimensions.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1807.00347v1">arXiv:1807.00347v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/k3hxzpfcr5dmbiirwrf26rdm6i">fatcat:k3hxzpfcr5dmbiirwrf26rdm6i</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20191018125145/https://arxiv.org/pdf/1807.00347v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/d3/6d/d36d2fae48305e40afde24173226f19ddea82a4e.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1807.00347v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Gaussian Differential Privacy [article]

Jinshuo Dong, Aaron Roth, Weijie J. Su
<span title="2019-05-30">2019</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy in the past decade. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analyzing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation, which we term 'f-differential privacy' (f-DP). This notion of privacy has a number of appealing properties and, in particular, avoids difficulties associated with divergence-based relaxations. First, f-DP preserves the hypothesis testing interpretation. In addition, f-DP allows for lossless reasoning about composition in an algebraic fashion. Moreover, we provide a powerful technique to import existing results proven for the original DP definition to f-DP and, as an application, obtain a simple subsampling theorem for f-DP. In addition to the above findings, we introduce a canonical single-parameter family of privacy notions within the f-DP class, referred to as 'Gaussian differential privacy' (GDP), defined based on testing two shifted Gaussians. GDP is focal among the f-DP class because of a central limit theorem we prove. More precisely, the privacy guarantees of any hypothesis-testing-based definition of privacy (including the original DP) converge to GDP in the limit under composition. The CLT also yields a computationally inexpensive tool for analyzing the exact composition of private algorithms. Taken together, this collection of attractive properties renders f-DP a mathematically coherent, analytically tractable, and versatile framework for private data analysis. Finally, we demonstrate the use of the tools we develop by giving an improved privacy analysis of noisy stochastic gradient descent.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1905.02383v3">arXiv:1905.02383v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ftaccm6hxfetfemzbhlbnlpza4">fatcat:ftaccm6hxfetfemzbhlbnlpza4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200923060923/https://arxiv.org/pdf/1905.02383v2.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a2/9a/a29ac94e5bc7fa4a1b376bd70999f27bb3cdf05a.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1905.02383v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Benign Overfitting and Noisy Features [article]

Zhu Li, Weijie Su, Dino Sejdinovic
<span title="2021-02-04">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Modern machine learning often operates in the regime where the number of parameters is much larger than the number of data points, with zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off. This benign overfitting phenomenon has recently been characterized using so-called double descent curves, where the risk undergoes another descent (in addition to the classical U-shaped learning curve when the number of parameters is small) as the number of parameters increases beyond a certain threshold. In this paper, we examine the conditions under which benign overfitting occurs in random feature (RF) models, i.e., in a two-layer neural network with fixed first-layer weights. We adopt a new view of random features and show that benign overfitting arises due to the noise that resides in such features (the noise may already be present in the data and propagate to the features, or it may be added by the user to the features directly) and plays an important implicit regularization role in the phenomenon.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2008.02901v2">arXiv:2008.02901v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/vduvfiw6anc77fcoxek46vtez4">fatcat:vduvfiw6anc77fcoxek46vtez4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210209094600/https://arxiv.org/pdf/2008.02901v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/01/f1/01f135161f0ce9189dd28f8e199937061a14c520.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2008.02901v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Group SLOPE - adaptive selection of groups of predictors [article]

Damian Brzyski, Weijie Su, Małgorzata Bogdan
<span title="2015-11-29">2015</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Sorted L-One Penalized Estimation (SLOPE) is a relatively new convex optimization procedure which allows for adaptive selection of regressors under sparse high-dimensional designs. Here we extend the idea of SLOPE to deal with the situation when one aims at selecting whole groups of explanatory variables instead of single regressors. This approach is particularly useful when variables in the same group are strongly correlated and thus true predictors are difficult to distinguish from their correlated neighbors. We formulate the respective convex optimization problem, gSLOPE (group SLOPE), and propose an efficient algorithm for its solution. We also define a notion of the group false discovery rate (gFDR) and provide a choice of the sequence of tuning parameters for gSLOPE so that gFDR is provably controlled at a prespecified level if the groups of variables are orthogonal to each other. Moreover, we prove that the resulting procedure adapts to unknown sparsity and is asymptotically minimax with respect to the estimation of the proportions of variance of the response variable explained by regressors from different groups. We also provide a method for the choice of the regularizing sequence when variables in different groups are not orthogonal but statistically independent, and illustrate its good properties with computer simulations.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1511.09078v1">arXiv:1511.09078v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ye5xt3n5mnbq5kaarafvrhkjki">fatcat:ye5xt3n5mnbq5kaarafvrhkjki</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200930172654/https://arxiv.org/pdf/1511.09078v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/2a/bc/2abc9a1e45a30ea307056e5808149e8b3e784737.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1511.09078v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Causal Inference Principles for Reasoning about Commonsense Causality [article]

Jiayao Zhang, Hongming Zhang, Dan Roth, Weijie J. Su
<span title="2022-01-31">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Commonsense causality reasoning (CCR) aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person. Although of great academic and practical interest, this problem is still shadowed by the lack of a well-posed theoretical framework; existing work usually relies wholeheartedly on deep language models and is potentially susceptible to confounding co-occurrences. Motivated by classical causal principles, we articulate the central question of CCR and draw parallels between human subjects in observational studies and natural languages to adopt CCR to the potential-outcomes framework, which is the first such attempt for commonsense tasks. We propose a novel framework, ROCK, to Reason O(A)bout Commonsense K(C)ausality, which utilizes temporal signals as incidental supervision and balances confounding effects using temporal propensities that are analogous to propensity scores. The ROCK implementation is modular and zero-shot, and demonstrates good CCR capabilities on various datasets.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2202.00436v1">arXiv:2202.00436v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/oavft5weard2jndwxt5vo6aal4">fatcat:oavft5weard2jndwxt5vo6aal4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20220203063757/https://arxiv.org/pdf/2202.00436v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/1c/d9/1cd9c695f8b9df5fdd17395af7c892ad1470148d.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2202.00436v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

The Local Elasticity of Neural Networks [article]

Hangfeng He, Weijie J. Su
<span title="2020-02-15">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
This paper presents a phenomenon in neural networks that we refer to as local elasticity. Roughly speaking, a classifier is said to be locally elastic if its prediction at a feature vector x' is not significantly perturbed after the classifier is updated via stochastic gradient descent at a (labeled) feature vector x that is dissimilar to x' in a certain sense. This phenomenon is shown to persist for neural networks with nonlinear activation functions through extensive simulations on real-life and synthetic datasets, whereas it is not observed in linear classifiers. In addition, we offer a geometric interpretation of local elasticity using the neural tangent kernel. Building on local elasticity, we obtain pairwise similarity measures between feature vectors, which can be used for clustering in conjunction with K-means. The effectiveness of the clustering algorithm on the MNIST and CIFAR-10 datasets in turn corroborates the hypothesis of local elasticity of neural networks on real-life data. Finally, we discuss some implications of local elasticity to shed light on several intriguing aspects of deep neural networks.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1910.06943v2">arXiv:1910.06943v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mnwprpqr4nh4jmg24rxsyftnja">fatcat:mnwprpqr4nh4jmg24rxsyftnja</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200321125448/https://arxiv.org/pdf/1910.06943v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1910.06943v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Whitening Sentence Representations for Better Semantics and Faster Retrieval [article]

Jianlin Su, Jiarun Cao, Weijie Liu, Yangyiwen Ou
<span title="2021-03-29">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Pre-training models such as BERT have achieved great success in many natural language processing tasks. However, how to obtain better sentence representations from these pre-training models remains worth exploring. Previous work has shown that the anisotropy problem is a critical bottleneck for BERT-based sentence representations, hindering the model from fully utilizing the underlying semantic features. Therefore, some attempts at boosting the isotropy of the sentence distribution, such as flow-based models, have been applied to sentence representations and achieved some improvement. In this paper, we find that the whitening operation from traditional machine learning can similarly enhance the isotropy of sentence representations and achieve competitive results. Furthermore, the whitening technique is also capable of reducing the dimensionality of the sentence representation. Our experimental results show that it can not only achieve promising performance but also significantly reduce the storage cost and accelerate retrieval.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2103.15316v1">arXiv:2103.15316v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/7clgmdujyrftxir5ioxnmw44jq">fatcat:7clgmdujyrftxir5ioxnmw44jq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210331001821/https://arxiv.org/pdf/2103.15316v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a2/fd/a2fd50aa4dff5e04ed8535d84550da8bff316208.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2103.15316v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

A Theorem of the Alternative for Personalized Federated Learning [article]

Shuxiao Chen, Qinqing Zheng, Qi Long, Weijie J. Su
<span title="2021-03-02">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
A widely recognized difficulty in federated learning arises from the statistical heterogeneity among clients: local datasets often come from different but not entirely unrelated distributions, and personalization is therefore necessary to achieve optimal results from each individual's perspective. In this paper, we show how the excess risks of personalized federated learning with a smooth, strongly convex loss depend on data heterogeneity from a minimax point of view. Our analysis reveals a surprising theorem of the alternative for personalized federated learning: there exists a threshold such that (a) if a certain measure of data heterogeneity is below this threshold, the FedAvg algorithm [McMahan et al., 2017] is minimax optimal; (b) when the measure of heterogeneity is above this threshold, pure local training (i.e., clients solve empirical risk minimization problems on their local datasets without any communication) is minimax optimal. As an implication, our results show that the presumably difficult (infinite-dimensional) problem of adapting to client-wise heterogeneity can be reduced to a simple binary decision problem of choosing between the two baseline algorithms. Our analysis relies on a new notion of algorithmic stability that takes into account the nature of federated learning.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2103.01901v1">arXiv:2103.01901v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/4jvgbwklerbobhkgwplrgcpu2q">fatcat:4jvgbwklerbobhkgwplrgcpu2q</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210304020732/https://arxiv.org/pdf/2103.01901v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/5e/8b/5e8b511102120ea831bfb16ca164dafc31e5e602.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2103.01901v1" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Assumption Lean Regression [article]

Richard Berk, Andreas Buja, Lawrence Brown, Edward George, Arun Kumar Kuchibhotla, Weijie J. Su, Linda Zhao
<span title="2018-06-26">2018</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
It is well known that models used in conventional regression analysis are commonly misspecified. A standard response is little more than a shrug: data analysts invoke Box's maxim that all models are wrong and then proceed as if the results were useful nevertheless. In this paper, we provide an alternative. Regression models are treated explicitly as approximations of a true response surface, an approach that can have a number of desirable statistical properties, including estimates that are asymptotically unbiased. Valid statistical inference follows. We generalize the formulation to include regression functionals, which broadens substantially the range of potential applications. An empirical application is provided to illustrate the paper's key concepts.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1806.09014v2">arXiv:1806.09014v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/w73tsdueprh7vpwxsmetbdkfrq">fatcat:w73tsdueprh7vpwxsmetbdkfrq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200928083918/https://arxiv.org/pdf/1806.09014v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/74/d9/74d97a0b39767393202d59e9726cd1ac5b96e3a3.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1806.09014v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Oneshot Differentially Private Top-k Selection [article]

Gang Qiao, Weijie J. Su, Li Zhang
<span title="2021-06-23">2021</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Being able to efficiently and accurately select the top-k elements with differential privacy is an integral component of various private data analysis tasks. In this paper, we present the oneshot Laplace mechanism, which generalizes the well-known Report Noisy Max mechanism to reporting noisy top-k elements. We show that the oneshot Laplace mechanism with a noise level of O(√k/ε) is approximately differentially private. Compared to the previous peeling approach of running Report Noisy Max k times, the oneshot Laplace mechanism adds noise and computes the top k elements only once, and is hence much more efficient for large k. In addition, our proof of privacy relies on a novel coupling technique that bypasses the use of composition theorems. Finally, we present a novel application of efficient top-k selection in the classical problem of ranking from pairwise comparisons.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2105.08233v2">arXiv:2105.08233v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/oaqkkbiibrepldwnnnlqzttamm">fatcat:oaqkkbiibrepldwnnnlqzttamm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210625161602/https://arxiv.org/pdf/2105.08233v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/9c/45/9c45c6e9d8178637810c67162911449b3859d000.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2105.08233v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>
Showing results 1-15 of 585