Independent Component Analysis [chapter]

Xudong Xie, Kin-Man Lam, Qionghai Dai, Xiaoming Peng, Jian Yang, Jingyu Yang, Xin Geng, Kate Smith-Miles, Seungjin Choi, Sarat C. Dass, S. Pankanti, S. Prabhakar (+44 others)
2009 Encyclopedia of Biometrics
A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, 2001.
A. Cichocki and S. Amari, Adaptive Blind Signal and Image Processing, 2002.

Theory and Preliminaries for ICA; Algorithms for ICA; Beyond ICA; Applications of ICA; Model; Theory

ICALAB Toolbox
ICALAB is a Matlab toolbox containing various ICA algorithms. Check out http://www.bsp.brain.riken.jp/ICALAB

What is ICA?
ICA is a statistical method whose goal is to decompose multivariate data $x \in \mathbb{R}^n$ into a linear sum of statistically independent components, i.e.,

$$x = s_1 a_1 + s_2 a_2 + \cdots + s_n a_n = As,$$

where $\{s_i\}$ are coefficients (sources, latent variables, encoding variables) and $\{a_i\}$ are basis vectors, the columns of $A$.
Constraint: the coefficients $\{s_i\}$ are statistically independent.
Goal: learn the basis vectors $A$ from the data samples $\{x(1), \ldots, x(N)\}$ alone.

ICA vs. PCA
Both: linear transform; compression (dimensionality reduction); classification (feature extraction).
PCA: second-order statistics (Gaussian); linear orthogonal transform; optimal coding in the mean-square (MS) sense.
ICA: higher-order statistics (non-Gaussian); linear non-orthogonal transform; related to projection pursuit (non-Gaussian is interesting); better features for classification?

Conventional gradient involves the following first-order approximation

$$J(W + E) \approx J(W) + \mathrm{tr}\{\nabla J^\top E\}$$

and searches for a direction $E$ that minimizes $J(W + E)$ under the norm constraint $\|E\| = \text{const}$.
Relative gradient involves the following first-order approximation

$$J(W + EW) \approx J(W) + \mathrm{tr}\{\nabla J^\top E W\} = J(W) + \mathrm{tr}\{(\nabla J\, W^\top)^\top E\}.$$

This leads to $\nabla_r J = \nabla J\, W^\top$.

Natural Gradient
Let $S_w = \{w \in \mathbb{R}^n\}$ be a parameter space on which an objective function $J(w)$ is defined. If the coordinate system is nonorthogonal, the squared length of a small increment $dw$ is $\|dw\|^2 = \sum_{i,j} g_{ij}(w)\, dw_i\, dw_j$, where $g_{ij}(w)$ is the Riemannian metric.
Theorem: The steepest descent direction of $J(w)$ in a Riemannian space is given by

$$-\nabla_{ng} J(w) = -G^{-1}(w)\, \nabla J(w),$$

where $G(w) = [g_{ij}(w)]$.

Natural Gradient ICA
In the context of ICA, the natural gradient turns out to have the form

$$\nabla_{ng} J(W) = \nabla J(W)\, W^\top W.$$

The natural gradient ICA algorithm is of the form

$$W^{(t+1)} = W^{(t)} + \eta \left\{ I - \varphi(y(t))\, y^\top(t) \right\} W^{(t)},$$

where $y(t) = W^{(t)} x(t)$ is the current source estimate, $\eta$ is the learning rate, and $\varphi$ is an elementwise nonlinearity.
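The natural gradient update can be tried on a toy problem. What follows is a minimal NumPy sketch, not the chapter's or ICALAB's implementation. Several choices in it are assumptions made for illustration: the function name `natural_gradient_ica`, the nonlinearity `tanh` (a common choice for heavy-tailed sources), the step size and iteration count, the 2×2 mixing matrix, and the use of a batch average of $\varphi(y) y^\top$ over all samples in place of the per-sample update.

```python
import numpy as np


def natural_gradient_ica(X, eta=0.1, n_iter=500, phi=np.tanh):
    """Batch natural gradient ICA (illustrative sketch).

    X   : (n, N) array, one zero-mean observation x(t) per column.
    eta : learning rate (assumed default, not taken from the source).
    phi : elementwise nonlinearity; tanh is an assumed choice that suits
          super-Gaussian (heavy-tailed) sources.
    Returns the (n, n) demixing matrix W, so that y = W x.
    """
    n, N = X.shape
    W = np.eye(n)
    for _ in range(n_iter):
        Y = W @ X  # current source estimates y(t), one per column
        # Sample average of I - phi(y) y^T over all N columns.
        G = np.eye(n) - (phi(Y) @ Y.T) / N
        # W <- W + eta (I - phi(y) y^T) W
        W = W + eta * (G @ W)
    return W


# Toy problem: two independent Laplacian sources, linearly mixed (x = A s).
rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])  # hypothetical mixing matrix
X = A @ S
X = X - X.mean(axis=1, keepdims=True)  # remove the sample mean

W = natural_gradient_ica(X)
P = W @ A  # global system; near a scaled permutation if separation worked
print(np.round(P, 3))
```

If separation succeeds, each row of `P = W @ A` has one dominant entry, reflecting the usual scaling and permutation ambiguity of ICA. For sub-Gaussian sources a different nonlinearity would be needed.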