Primal sparse Max-margin Markov networks

Jun Zhu, Eric P. Xing, Bo Zhang
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '09), 2009
Max-margin Markov networks (M³N) have shown great promise in structured prediction and relational learning. Due to the KKT conditions, M³N enjoys dual sparsity (only a small fraction of the dual variables are non-zero). However, the existing M³N formulation does not enjoy primal sparsity, which is a desirable property for selecting significant features and reducing the risk of over-fitting. In this paper, we present an ℓ₁-norm regularized max-margin Markov network (ℓ₁-M³N), which enjoys dual and primal sparsity simultaneously. To learn an ℓ₁-M³N, we present three methods: projected sub-gradient, cutting-plane, and a novel EM-style algorithm based on an equivalence between ℓ₁-M³N and an adaptive M³N. We perform extensive empirical studies on both synthetic and real data sets. Our experimental results show that: (1) ℓ₁-M³N can effectively select significant features; (2) ℓ₁-M³N performs as well as the pseudo-primal-sparse Laplace M³N in prediction accuracy, while consistently outperforming other competing methods that enjoy either primal or dual sparsity; and (3) the EM-style algorithm is more robust than the other two in prediction accuracy and time efficiency.
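The abstract does not state the optimization problem explicitly; the following is a minimal sketch of the ℓ₁-M³N objective implied by the description, written in the standard M³N notation of Taskar et al. (the symbols Δf_i(y), Δℓ_i(y), and the regularization constant λ are assumptions here, not taken from this record). The ℓ₂ regularizer of the original M³N is replaced by an ℓ₁-norm penalty, which drives individual feature weights exactly to zero and thus yields primal sparsity:

\[
\min_{\mathbf{w},\,\boldsymbol{\xi}} \;\; \lambda \|\mathbf{w}\|_1 + \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad
\mathbf{w}^\top \Delta\mathbf{f}_i(y) \;\ge\; \Delta\ell_i(y) - \xi_i,
\;\; \xi_i \ge 0, \;\; \forall i,\; \forall y \ne y_i,
\]

where \(\Delta\mathbf{f}_i(y) = \mathbf{f}(x_i, y_i) - \mathbf{f}(x_i, y)\) is the feature difference and \(\Delta\ell_i(y)\) the structured loss. Under this reading, dual sparsity still follows from the KKT conditions on the margin constraints, while the ℓ₁ penalty adds sparsity in the primal weights \(\mathbf{w}\).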
doi:10.1145/1557019.1557132 dblp:conf/kdd/ZhuXZ09