An Exponential Tail bound for Lq Stable Learning Rules
International Conference on Algorithmic Learning Theory
There is an accumulating evidence in the literature that stability of learning algorithms is a key characteristic that permits a learning algorithm to generalize. Despite various insightful results in this direction, there seems to be an overlooked dichotomy in the type of stability-based generalization bounds we have in the literature. On one hand, the literature seems to suggest that exponential generalization bounds for the estimated risk, which are optimal, can be only obtained through
... gent, distribution independent and computationally intractable notions of stability such as uniform stability. On the other hand, it seems that weaker notions of stability such as hypothesis stability, although it is distribution dependent and more amenable to computation, can only yield polynomial generalization bounds for the estimated risk, which are suboptimal. In this paper, we address the gap between these two regimes of results. In particular, the main question we address here is whether it is possible to derive exponential generalization bounds for the estimated risk using a notion of stability that is computationally tractable and distribution dependent, but weaker than uniform stability. Using recent advances in concentration inequalities, and using a notion of stability that is weaker than uniform stability but distribution dependent and amenable to computation, we derive an exponential tail bound for the concentration of the estimated risk of a hypothesis returned by a general learning rule, where the estimated risk is expressed in terms of either the resubstitution estimate (empirical error), or the deleted (or, leave-one-out) estimate. As an illustration we derive exponential tail bounds for ridge regression with unbounded responses, where we show how stability changes with the tail behavior of the response variables.