Better Approximation and Faster Algorithm Using the Proximal Average

Yaoliang Yu
2013 Neural Information Processing Systems  
It is common practice to approximate "complicated" functions with friendlier ones. In large-scale machine learning applications, nonsmooth losses and regularizers that pose great computational challenges are usually approximated by smooth functions. We re-examine this powerful methodology and point out a nonsmooth approximation that simply pretends the proximal map is linear. The new approximation is justified using a recent tool from convex analysis, the proximal average, and yields a novel proximal gradient algorithm that is strictly better than the one based on smoothing, without incurring any extra overhead. Numerical experiments on two important applications, overlapping group lasso and graph-guided fused lasso, corroborate the theoretical claims.
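The idea of "pretending the proximal map is linear" can be illustrated with a minimal sketch. It assumes a regularizer that is a weighted average of simple functions, each with a cheap proximal map, and replaces the hard-to-compute proximal map of the average with the same weighted average of the individual proximal maps. The function names, the quadratic loss 0.5*||x - b||^2, and the choice of the l1 and l2 norms as components are all illustrative assumptions, not taken from the paper, and the loop is a plain (unaccelerated) proximal gradient iteration.

```python
import math

def prox_l1(v, t):
    """Proximal map of t * ||.||_1: coordinate-wise soft thresholding."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def prox_l2(v, t):
    """Proximal map of t * ||.||_2: shrink the whole vector toward zero."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        return [0.0] * len(v)
    scale = 1.0 - t / norm
    return [scale * x for x in v]

def averaged_prox(v, t, proxes, weights):
    """Weighted average of the individual proximal maps.

    This is the 'linearity' shortcut: each component prox is applied to the
    same point, and the results are averaged with the given weights.
    """
    out = [0.0] * len(v)
    for prox, w in zip(proxes, weights):
        p = prox(v, t)
        out = [o + w * pi for o, pi in zip(out, p)]
    return out

def proximal_average_gradient(b, step, n_iter, proxes, weights):
    """Proximal gradient on the illustrative loss 0.5 * ||x - b||^2,
    using the averaged prox in place of the prox of the averaged regularizer."""
    x = [0.0] * len(b)
    for _ in range(n_iter):
        grad = [xi - bi for xi, bi in zip(x, b)]          # gradient of the loss
        v = [xi - step * gi for xi, gi in zip(x, grad)]   # gradient step
        x = averaged_prox(v, step, proxes, weights)       # averaged prox step
    return x

if __name__ == "__main__":
    x = proximal_average_gradient(
        b=[3.0, 4.0], step=1.0, n_iter=10,
        proxes=[prox_l1, prox_l2], weights=[0.5, 0.5],
    )
    print(x)
```

Each iteration costs one gradient evaluation plus one proximal map per component, which is the same per-step work a smoothing-based method would need to evaluate the gradient of the smoothed regularizer.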
dblp:conf/nips/Yu13a fatcat:6bvsfb2ohrbypho37djwiv3mxa