### "Context-aware access control and presentation for linked data" by Luca Costabello with Prateek Jain as coordinator

2015, *ACM SIGWEB Newsletter*

### Provable Inductive Matrix Completion
[article]

2013, *arXiv*, pre-print

Consider a movie recommendation system where apart from the ratings information, side information such as user's age or movie's genre is also available. Unlike standard matrix completion, in this setting one should be able to predict inductively on new users/movies. In this paper, we study the problem of inductive matrix completion in the exact recovery setting. That is, we assume that the ratings matrix is generated by applying feature vectors to a low-rank matrix and the goal is to recover the underlying matrix. Furthermore, we generalize the problem to that of low-rank matrix estimation using rank-1 measurements. We study this generic problem and provide conditions that the set of measurements should satisfy so that the alternating minimization method (which otherwise is a non-convex method with no convergence guarantees) is able to recover the exact underlying low-rank matrix. In addition to inductive matrix completion, we show that two other low-rank estimation problems can be studied in our framework: a) general low-rank matrix sensing using rank-1 measurements, and b) multi-label regression with missing labels. For both problems, we provide novel and interesting bounds on the number of measurements required by alternating minimization to provably converge to the exact low-rank matrix. In particular, our analysis for the general low-rank matrix sensing problem significantly improves on the storage and computational cost required by the RIP-based matrix sensing methods [RechtFP2007]. Finally, we provide empirical validation of our approach and demonstrate that alternating minimization is able to recover the true matrix for the above-mentioned problems using a small number of measurements.

arXiv:1306.0626v1
fatcat:tv5ahzkrpfcerhi7tpa6cvj6cm
### Memory Limited, Streaming PCA
[article]

2013, *arXiv*, pre-print

We consider streaming, one-pass principal component analysis (PCA), in the high-dimensional regime, with limited memory. Here, p-dimensional samples are presented sequentially, and the goal is to produce the k-dimensional subspace that best approximates these points. Standard algorithms require O(p^2) memory; meanwhile no algorithm can do better than O(kp) memory, since this is what the output itself requires. Memory (or storage) complexity is most meaningful when understood in the context of computational and sample complexity. Sample complexity for high-dimensional PCA is typically studied in the setting of the spiked covariance model, where p-dimensional points are generated from a population covariance equal to the identity (white noise) plus a low-dimensional perturbation (the spike) which is the signal to be recovered. It is now well-understood that the spike can be recovered when the number of samples, n, scales proportionally with the dimension, p. Yet, all algorithms that provably achieve this have memory complexity O(p^2). Meanwhile, algorithms with memory complexity O(kp) do not have provable bounds on sample complexity comparable to p. We present an algorithm that achieves both: it uses O(kp) memory (meaning storage of any kind) and is able to compute the k-dimensional spike with O(p log p) sample complexity, the first algorithm of its kind. While our theoretical analysis focuses on the spiked covariance model, our simulations show that our algorithm is successful on much more general models for the data.

arXiv:1307.0032v1
fatcat:dgmcwcgldre3pc4tggw6owdaka
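The memory constraint in this abstract can be illustrated with a small sketch. It assumes the single-component case (k = 1) and uses a generic block power iteration that reads each sample once and stores only two length-p vectors. The abstract does not describe the paper's algorithm, so this should not be read as the authors' method; the function names, the block size, and the spiked-model generator are all hypothetical choices made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def streaming_top_component(stream, p, block_size):
    """One-pass estimate of the top principal direction using O(p) memory.

    Only the current estimate q and one accumulator s are stored; no
    p-by-p covariance matrix is ever formed.
    """
    q = normalize([random.gauss(0.0, 1.0) for _ in range(p)])
    s = [0.0] * p
    seen = 0
    for x in stream:
        # Accumulate x * (x . q), i.e. one sample's share of (covariance @ q).
        proj = sum(xi * qi for xi, qi in zip(x, q))
        for j in range(p):
            s[j] += proj * x[j]
        seen += 1
        if seen == block_size:
            # End of block: one power-iteration step, then reset the accumulator.
            q = normalize(s)
            s = [0.0] * p
            seen = 0
    return q


def spiked_samples(u, strength, n):
    """Yield n samples with covariance identity + strength^2 * u u^T (illustrative)."""
    for _ in range(n):
        g = random.gauss(0.0, 1.0)
        yield [strength * g * ui + random.gauss(0.0, 1.0) for ui in u]


random.seed(0)
p = 20
u = normalize([1.0] * p)  # hypothetical planted spike direction
q_hat = streaming_top_component(spiked_samples(u, 3.0, 5000), p, block_size=500)
overlap = abs(sum(a * b for a, b in zip(q_hat, u)))
```

Here `overlap` is the absolute cosine between the estimate and the planted direction; values near 1 indicate that the spike was found. Larger blocks reduce the noise in each power step but leave fewer steps for a fixed number of samples.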
### Supervised Learning with Similarity Functions
[article]

2012, *arXiv*, pre-print

We address the problem of general supervised learning when data can only be accessed through an (indefinite) similarity function between data points. Existing work on learning with indefinite kernels has concentrated solely on binary/multi-class classification problems. We propose a model that is generic enough to handle any supervised learning task and also subsumes the model previously proposed for classification. We give a "goodness" criterion for similarity functions w.r.t. a given supervised learning task and then adapt a well-known landmarking technique to provide efficient algorithms for supervised learning using "good" similarity functions. We demonstrate the effectiveness of our model on three important supervised learning problems: a) real-valued regression, b) ordinal regression and c) ranking, where we show that our method guarantees bounded generalization error. Furthermore, for the case of real-valued regression, we give a natural goodness definition that, when used in conjunction with a recent result in sparse vector recovery, guarantees a sparse predictor with bounded generalization error. Finally, we report results of our learning algorithms on regression and ordinal regression tasks using non-PSD similarity functions and demonstrate the effectiveness of our algorithms, especially that of the sparse landmark selection algorithm, which achieves significantly higher accuracies than the baseline methods while offering reduced computational costs.

arXiv:1210.5840v1
fatcat:3tibntrwfzac7nebgkxnwbqiju
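The landmarking idea mentioned in this abstract can be sketched in a few lines: each point is represented by its similarities to a small set of landmark points, after which any linear learner can be applied to the resulting vectors. This is a minimal illustration under assumptions, not the paper's procedure. The helper names, the random landmark choice, and the example similarity (negative absolute difference, which is not positive semidefinite) are hypothetical.

```python
import random


def choose_landmarks(data, d, seed=0):
    """Pick d landmark points uniformly at random from the data (illustrative choice)."""
    return random.Random(seed).sample(data, d)


def landmark_embedding(x, landmarks, similarity):
    """Represent x by the vector of its similarities to each landmark."""
    return [similarity(x, landmark) for landmark in landmarks]


def linear_predict(weights, features):
    """Apply a linear predictor to an embedded point."""
    return sum(w * f for w, f in zip(weights, features))


def neg_abs_diff(a, b):
    """Example similarity that is symmetric but not positive semidefinite."""
    return -abs(a - b)


landmarks = [0.0, 2.0]
phi = landmark_embedding(0.5, landmarks, neg_abs_diff)  # [-0.5, -1.5]
score = linear_predict([1.0, -1.0], phi)                # 1.0
picked = choose_landmarks([0.0, 1.0, 2.0, 3.0, 4.0], 2)
```

The weights passed to `linear_predict` are fixed by hand here; in practice they would be learned from the embedded training points.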
### Universal Matrix Completion
[article]

2014, *arXiv*, pre-print

., 2010), alternating minimization (*Jain* et al., 2012). ...

arXiv:1402.2324v2
fatcat:b36tn3g5onh3xotb4xrdpihgji
### Provable Tensor Factorization with Missing Data
[article]

2014, *arXiv*, pre-print

We study the problem of low-rank tensor factorization in the presence of missing data. We ask the following question: how many sampled entries do we need, to efficiently and exactly reconstruct a tensor with a low-rank orthogonal decomposition? We propose a novel alternating minimization based method which iteratively refines estimates of the singular vectors. We show that under certain standard assumptions, our method can recover a three-mode n × n × n dimensional rank-r tensor exactly from O(n^{3/2} r^5 log^4 n) randomly sampled entries. In the process of proving this result, we solve two challenging sub-problems for tensors with missing data. First, in the process of analyzing the initialization step, we prove a generalization of a celebrated result by Szemerédi et al. on the spectrum of random graphs. Next, we prove global convergence of alternating minimization with a good initialization. Simulations suggest that the dependence of the sample size on dimensionality n is indeed tight.

arXiv:1406.2784v1
fatcat:3zevn62zp5g63bl6du25agbe4i
### Orthogonal Matching Pursuit with Replacement
[article]

2011, *arXiv*, pre-print

In this paper, we consider the problem of compressed sensing where the goal is to recover almost all the sparse vectors using a small number of fixed linear measurements. For this problem, we propose a novel partial hard-thresholding operator that leads to a general family of iterative algorithms. While one extreme of the family yields well known hard thresholding algorithms like ITI (Iterative Thresholding with Inversion) and HTP (Hard Thresholding Pursuit), the other end of the spectrum leads to a novel algorithm that we call Orthogonal Matching Pursuit with Replacement (OMPR). OMPR, like the classic greedy algorithm OMP, adds exactly one coordinate to the support at each iteration, based on the correlation with the current residual. However, unlike OMP, OMPR also removes one coordinate from the support. This simple change allows us to prove that OMPR has the best known guarantees for sparse recovery in terms of the Restricted Isometry Property (a condition on the measurement matrix). In contrast, OMP is known to have very weak performance guarantees under RIP. Given its simple structure, we are able to extend OMPR using locality sensitive hashing to get OMPR-Hash, the first provably sub-linear (in dimensionality) algorithm for sparse recovery. Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit. We provide experimental results on large problems providing recovery for vectors of size up to a million dimensions. We demonstrate that for large-scale problems our proposed methods are more robust and faster than existing methods.

arXiv:1106.2774v1
fatcat:rwy4cihalnc6fbi57rtxhqhsau
### Similarity-based Learning via Data Driven Embeddings
[article]

2011, *arXiv*, pre-print

We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task and propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [Balcan-Blum ICML 2006] and [Wang et al ICML 2007]. An attractive feature of our framework is its adaptability to data: we do not promote a fixed notion of goodness but rather let data dictate it. We show, by giving theoretical guarantees, that the goodness criterion best suited to a problem can itself be learned, which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity-based heuristic to perform task-driven selection of landmark points instead of random selection. We demonstrate the effectiveness of our goodness criteria learning method as well as the landmark selection heuristic on a variety of similarity-based learning datasets and benchmark UCI datasets on which our method consistently outperforms existing approaches by a significant margin.

arXiv:1112.5404v1
fatcat:74nhm5skdzgxdgfu6xx7fpeiw4
### Robust Regression via Hard Thresholding
[article]

2015, *arXiv*, pre-print

We study the problem of Robust Least Squares Regression (RLSR) where several response variables can be adversarially corrupted. More specifically, for a data matrix X ∈ R^{p × n} and an underlying model w*, the response vector is generated as y = X^T w* + b, where b ∈ R^n is the corruption vector supported over at most C·n coordinates. Existing exact recovery results for RLSR focus solely on L1-penalty based convex formulations and impose relatively strict model assumptions such as requiring the corruptions b to be selected independently of X. In this work, we study a simple hard-thresholding algorithm called TORRENT which, under mild conditions on X, can recover w* exactly even if b corrupts the response variables in an adversarial manner, i.e. both the support and entries of b are selected adversarially after observing X and w*. Our results hold under deterministic assumptions which are satisfied if X is sampled from any sub-Gaussian distribution. Finally, unlike existing results that apply only to a fixed w*, generated independently of X, our results are universal and hold for any w* ∈ R^p. Next, we propose gradient descent-based extensions of TORRENT that can scale efficiently to large-scale problems, such as high-dimensional sparse recovery, and prove similar recovery guarantees for these extensions. Empirically we find TORRENT, and more so its extensions, offering significantly faster recovery than the state-of-the-art L1 solvers. For instance, even on moderate-sized datasets (with p = 50K) with around 40% corrupted responses, a variant of our proposed method called TORRENT-HYB is more than 20x faster than the best L1 solver.

arXiv:1506.02428v1
fatcat:gn3zyiro5nbqtodko6xn2yjkaa
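The model y = X^T w* + b and the general idea of hard thresholding can be made concrete with a small sketch. It alternates between a least-squares fit on the currently trusted points and re-selecting the points with the smallest residuals. This is written in the spirit of the abstract only: it is not claimed to be TORRENT as specified in the paper. The two-feature setup, the closed-form solver, the function names, and the synthetic corruption pattern are all illustrative assumptions.

```python
import random


def fit_least_squares(points):
    """Solve the 2x2 normal equations for y ~ w1*x1 + w2*x2 in closed form."""
    a = sum(x1 * x1 for (x1, x2), _ in points)
    b = sum(x1 * x2 for (x1, x2), _ in points)
    c = sum(x2 * x2 for (x1, x2), _ in points)
    d = sum(x1 * y for (x1, x2), y in points)
    e = sum(x2 * y for (x1, x2), y in points)
    det = a * c - b * b
    return ((d * c - b * e) / det, (a * e - b * d) / det)


def hard_threshold_regression(points, num_corrupt, iters=10):
    """Alternate between fitting and keeping the smallest-residual points."""
    keep = len(points) - num_corrupt
    w = fit_least_squares(points)  # initial fit on all points, corrupted or not
    for _ in range(iters):
        # Hard-thresholding step: rank by absolute residual under the current
        # model and trust only the `keep` best-explained points.
        ranked = sorted(
            points,
            key=lambda pt: abs(pt[1] - (w[0] * pt[0][0] + w[1] * pt[0][1])),
        )
        w = fit_least_squares(ranked[:keep])
    return w


random.seed(0)
w_true = (2.0, -1.0)  # hypothetical ground-truth model
points = []
for i in range(200):
    x = (random.gauss(0.0, 1.0), random.gauss(0.0, 1.0))
    y = w_true[0] * x[0] + w_true[1] * x[1]
    if i < 20:
        y += random.choice((-10.0, 10.0))  # sparse corruption vector b
    points.append((x, y))

w_hat = hard_threshold_regression(points, num_corrupt=20)
```

In this synthetic setup the corrupted responses stand out with large residuals after the first fit, so the trusted set quickly becomes clean and `w_hat` matches `w_true` up to floating-point error. The sketch takes the number of corrupted points as given; an upper bound would be used when it is unknown.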
### Fast Exact Matrix Completion with Finite Samples
[article]

2014, *arXiv*, pre-print

Singular Value Projection Before we go on to prove Theorem 1, in this section we will analyze the basic SVP algorithm (Algorithm 1), bounding its sample complexity and thereby resolving a question posed by *Jain* ...

arXiv:1411.1087v1
fatcat:doazc3pr5jbpdpuwqh5dafqejq
### Basics of interpreting results

2015, *International Dental & Medical Journal of Advanced Research - VOLUME 2015*

The correct interpretation of research results is of paramount importance to know the effectiveness of the study. Researchers should describe the result clearly, and in a way that other researchers can compare them with their own results. For correct interpretation of results, sound knowledge of research methodology and statistics is needed. Results should be analyzed using appropriate statistical methods to try to determine the probability that they may have occurred by chance, and may not be replicable in larger studies. Results need to be interpreted in an objective and critical way, before assessing their implications and drawing conclusions. The aim of the present review was to highlight the basic points to be kept in mind by the researcher while interpreting the results of a research paper.

doi:10.15713/ins.idmjar.26
fatcat:hkhwtfjfxnc7flhjj7mudf2lga
### DROCC: Deep Robust One-Class Classification
[article]

2020, *arXiv*, pre-print

Correspondence to: *Prateek Jain* <prajain@microsoft.com>. Proceedings of the 37th International Conference on Machine Learning, Online, PMLR 119, 2020. Copyright 2020 by the author(s). ...

arXiv:2002.12718v2
fatcat:3vxztrzyrvfz3k7v7cjr2fsyqa
### Locally Non-linear Embeddings for Extreme Multi-label Learning
[article]

2015, *arXiv*, pre-print

The objective in extreme multi-label learning is to train a classifier that can automatically tag a novel data point with the most relevant subset of labels from an extremely large label set. Embedding based approaches make training and prediction tractable by assuming that the training label matrix is low-rank and hence the effective number of labels can be reduced by projecting the high dimensional label vectors onto a low dimensional linear subspace. Still, leading embedding approaches have been unable to deliver high prediction accuracies or scale to large problems as the low rank assumption is violated in most real world applications. This paper develops the X-One classifier to address both limitations. The main technical contribution in X-One is a formulation for learning a small ensemble of local distance preserving embeddings which can accurately predict infrequently occurring (tail) labels. This allows X-One to break free of the traditional low-rank assumption and boost classification accuracy by learning embeddings which preserve pairwise distances between only the nearest label vectors. We conducted extensive experiments on several real-world as well as benchmark data sets and compared our method against state-of-the-art methods for extreme multi-label classification. Experiments reveal that X-One can make significantly more accurate predictions than the state-of-the-art methods including both embeddings (by as much as 35%) as well as trees (by as much as 6%). X-One can also scale efficiently to data sets with a million labels which are beyond the pale of leading embedding methods.

arXiv:1507.02743v1
fatcat:akyvda6lqzhn3kufwyosfdrn5u
### Thresholding based Efficient Outlier Robust PCA
[article]

2017, *arXiv*, pre-print

We consider the problem of outlier robust PCA (OR-PCA) where the goal is to recover principal directions despite the presence of outlier data points. That is, given a data matrix M^*, where (1-α) fraction of the points are noisy samples from a low-dimensional subspace while α fraction of the points can be arbitrary outliers, the goal is to recover the subspace accurately. Existing results for OR-PCA have serious drawbacks: while some results are quite weak in the presence of noise, other results have runtime quadratic in dimension, rendering them impractical for large scale applications. In this work, we provide a novel thresholding based iterative algorithm with per-iteration complexity at most linear in the data size. Moreover, the fraction of outliers, α, that our method can handle is tight up to constants while providing nearly optimal computational complexity for a general noise setting. For the special case where the inliers are obtained from a low-dimensional subspace with additive Gaussian noise, we show that a modification of our thresholding based method leads to significant improvement in recovery error (of the subspace) even in the presence of a large fraction of outliers.

arXiv:1702.05571v1
fatcat:pvqdkpi2ezcvncuuw4w25di7r4
### Provable Submodular Minimization using Wolfe's Algorithm
[article]

2014, *arXiv*, pre-print

Owing to several applications in large scale learning and vision problems, fast submodular function minimization (SFM) has become a critical problem. Theoretically, unconstrained SFM can be performed in polynomial time [IFF 2001, IO 2009]. However, these algorithms are typically not practical. In 1976, Wolfe proposed an algorithm to find the minimum Euclidean norm point in a polytope, and in 1980, Fujishige showed how Wolfe's algorithm can be used for SFM. For general submodular functions, this Fujishige-Wolfe minimum norm algorithm seems to have the best empirical performance. Despite its good practical performance, very little is known about Wolfe's minimum norm algorithm theoretically. To our knowledge, the only result is an exponential time analysis due to Wolfe himself. In this paper we give a maiden convergence analysis of Wolfe's algorithm. We prove that in t iterations, Wolfe's algorithm returns an O(1/t)-approximate solution to the min-norm point on any polytope. We also prove a robust version of Fujishige's theorem which shows that an O(1/n^2)-approximate solution to the min-norm point on the base polytope implies exact submodular minimization. As a corollary, we get the first pseudo-polynomial time guarantee for the Fujishige-Wolfe minimum norm algorithm for unconstrained submodular function minimization.

arXiv:1411.0095v1
fatcat:tasfwfaa6veklm2whrzsel5724

*Showing results 1 — 15 out of 554 results*