On feature extraction by mutual information maximization

Kari Torkkola
2002 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
We present a method for learning discriminative feature transforms using as criterion the mutual information between class labels and transformed features. Instead of a commonly used mutual information measure based on Kullback-Leibler divergence, we use a quadratic divergence measure, which allows an efficient non-parametric implementation and requires no prior assumptions about class densities. In addition to linear transforms, we also discuss nonlinear transforms that are implemented as radial basis function networks. Extensions to reduce the computational complexity are also presented, and a comparison to greedy feature selection is made.
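The practical appeal of the quadratic divergence is that, under Parzen density estimates with Gaussian kernels, the mutual information estimate reduces to sums of pairwise kernel evaluations between samples, so it can be computed and differentiated directly from data. Below is a minimal JAX sketch of this idea for a linear transform: the QMI decomposition into within-class, all-pairs, and cross terms follows this kernel-sum form, while the function names, toy data, learning rate, and the plain gradient-ascent loop with column renormalization are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch: maximize a non-parametric quadratic mutual information (QMI)
# estimate between class labels and linearly transformed features Y = X @ W.
# Assumptions (not from the paper): toy data, sigma=1.0, lr=0.1, unit-norm W.
import jax
import jax.numpy as jnp

def pairwise_gauss(Y, sigma):
    """Matrix of Gaussian kernels G(y_i - y_j, 2*sigma^2 I) over all sample pairs."""
    d2 = jnp.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    d = Y.shape[1]
    norm = (4.0 * jnp.pi * sigma**2) ** (d / 2.0)
    return jnp.exp(-d2 / (4.0 * sigma**2)) / norm

def qmi(W, X, same, sigma):
    """QMI estimate I_T = V_in + V_all - 2*V_btw from pairwise kernel sums.

    `same` is an (N, N) 0/1 matrix marking same-class sample pairs.
    """
    Y = X @ W
    N = Y.shape[0]
    G = pairwise_gauss(Y, sigma)
    n_c = same.sum(axis=1)                      # class size N_c of each sample's class
    v_in = (same * G).sum() / N**2              # within-class information potential
    v_all = G.sum() * same.sum() / N**4         # all-pairs potential, prior-weighted
    v_btw = (n_c * G.sum(axis=1)).sum() / N**3  # cross term between the two
    return v_in + v_all - 2.0 * v_btw

# Toy usage: two 3-D Gaussian classes separated along the first axis;
# learn a 3x1 transform by gradient ascent on the QMI estimate.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
X = jnp.concatenate([
    jax.random.normal(k1, (100, 3)) + jnp.array([2.0, 0.0, 0.0]),
    jax.random.normal(k2, (100, 3)) - jnp.array([2.0, 0.0, 0.0]),
])
labels = jnp.concatenate([jnp.zeros(100), jnp.ones(100)])
same = (labels[:, None] == labels[None, :]).astype(X.dtype)

W = jax.random.normal(k3, (3, 1))
grad_fn = jax.jit(jax.grad(qmi))  # gradient w.r.t. the transform W
for _ in range(200):
    W = W + 0.1 * grad_fn(W, X, same, 1.0)
    W = W / jnp.linalg.norm(W, axis=0)  # keep columns unit-norm
print(W.ravel())  # should align with the discriminative first axis
```

Because every term is a sum of Gaussian kernel values between sample pairs, the objective is smooth in W and differentiable by automatic differentiation; the paper's extensions for reducing computational complexity target the O(N^2) cost of exactly these pairwise sums.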
doi:10.1109/icassp.2002.5743865 dblp:conf/icassp/Torkkola02 fatcat:uzp6s7pznjbezkzyb2of2o7fvy