Special issue: Optimization models and algorithms for data science
In the modern information age, data and decisions are more strongly linked than ever before. The recent emergence of 'data science' as an interdisciplinary field that aims to distil insights from data and to inform data-driven decision-making bears witness to the importance of rigorous, evidence-based decision-making that crosses the boundaries of statistics, computer science, machine learning and optimization. This special issue focuses on recent data science-related advances in optimization theory. The contributions include optimization models and algorithms that deal with large data sets, as well as applications from business analytics and machine learning that have emerged in recent decades as data has accumulated from multiple sources.

In the first paper (Max-Norm Optimization for Robust Matrix Recovery), Ethan X. Fang, Han Liu, Kim-Chuan Toh and Wen-Xin Zhou develop a new estimator for the matrix completion problem under arbitrary sampling schemes. The matrix completion problem aims to reconstruct an unknown matrix from a small number of noise-contaminated entries. Existing methods assume that the indices of the observed entries follow a uniform distribution, an assumption that is restrictive in many applications, including the well-known Netflix problem. To address this issue, the authors propose an estimator that combines max-norm and nuclear-norm regularization and solve the resulting model with the alternating direction method of multipliers (ADMM). Through numerical experiments on real datasets, the authors demonstrate improvements over existing estimators without a significant increase in computational effort.

In the second paper (Primal and Dual Predicted Decrease Approximation Methods), Amir Beck, Edouard Pauwels and Shoham Sabach develop the notion of 'predicted decrease approximation' to provide a unified convergence analysis for various existing algorithms for constrained convex optimization. To this end, the authors consider the problem of minimizing smooth functions over compact convex sets using linear oracles. Methods based on linear oracles perform inexpensive computations at each iteration and are widely used in data science applications. The authors unify the convergence analysis of the generalized conditional gradient method, proximal gradient methods, greedy coordinate descent for separable constraints and working set methods for linear equality constraints with bounds.
The analysis is performed using the concept of 'predicted decrease' that ensures that an algorithmic step is 'at least as good' as a conditional gradient step. The authors show that the dual application of this approach leads to primal-dual convergence guarantees that hold even if the primal model is only partially strongly convex.
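To make the matrix completion setting of the first paper concrete, the sketch below recovers a low-rank matrix from a subset of its entries using plain nuclear-norm regularization solved by iterative SVD soft-thresholding (the Soft-Impute scheme). This is a standard baseline only, not the hybrid max-norm/nuclear-norm estimator or the ADMM solver of Fang et al.; the matrix sizes and the regularization weight `lam` are illustrative choices.

```python
import numpy as np

def soft_impute(M_obs, mask, lam, n_iter=200):
    """Nuclear-norm-regularized matrix completion via SVD soft-thresholding.

    A simple Soft-Impute baseline: at each iteration, fill the unobserved
    entries with the current estimate and apply the proximal operator of
    lam * ||.||_* (singular-value soft-thresholding).
    """
    X = np.zeros_like(M_obs)
    for _ in range(n_iter):
        # Keep observed data, fill missing entries with the current estimate
        Y = np.where(mask, M_obs, X)
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt
    return X

# Illustrative example: a rank-one matrix with ~70% of entries observed
rng = np.random.default_rng(0)
u = rng.standard_normal(20)
M = np.outer(u, u)                    # rank-one ground truth
mask = rng.random(M.shape) < 0.7      # Boolean pattern of observed entries
X = soft_impute(np.where(mask, M, 0.0), mask, lam=0.5)
```

The estimator discussed in the paper replaces (and augments) the nuclear-norm penalty with a max-norm term precisely because the nuclear norm alone relies on uniformly sampled entries, the restriction highlighted above.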
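The conditional gradient step that anchors the second paper's analysis can be sketched as follows. The code runs the classical Frank-Wolfe method over the probability simplex, where the linear oracle simply picks the vertex with the smallest gradient coordinate; the duality gap it computes is exactly the 'predicted decrease' of the conditional gradient step. This is a minimal illustration of the general framework, not the authors' unified algorithm; the objective, step-size rule and tolerance are assumptions for the example.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, max_iter=200, tol=1e-8):
    """Conditional gradient (Frank-Wolfe) over the probability simplex.

    The linear oracle argmin_{s in simplex} <g, s> returns the vertex e_i
    with i = argmin_j g_j; the quantity <g, x - s> is the predicted
    decrease of the step and upper-bounds the optimality gap f(x) - f*.
    """
    x = x0.copy()
    gap = np.inf
    for k in range(max_iter):
        g = grad(x)
        i = int(np.argmin(g))        # linear oracle: cheapest vertex
        s = np.zeros_like(x)
        s[i] = 1.0
        gap = g @ (x - s)            # predicted decrease (nonnegative)
        if gap < tol:
            break
        step = 2.0 / (k + 2.0)       # classical open-loop step size
        x = (1.0 - step) * x + step * s
    return x, gap

# Example: project b onto the simplex, i.e. minimize 0.5 * ||x - b||^2
b = np.array([0.1, 0.9, -0.2])
x, gap = frank_wolfe_simplex(lambda x: x - b, np.ones(3) / 3)
```

Because each iterate is a convex combination of simplex vertices, feasibility is maintained without projections, which is what makes linear-oracle methods inexpensive per iteration.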