ENTROPY-GUIDED FEATURE GENERATION FOR LARGE MARGIN STRUCTURED LEARNING
Monografias em Ciência da Computação
Structured learning consists of learning a mapping from inputs to structured outputs from a sample of correct input-output pairs. Many important problems fit this setting. For instance, dependency parsing involves recognizing a tree underlying a sentence. Feature generation is an important subtask of structured learning modeling. Usually, it is partially handled by a domain expert who builds complex and discriminative feature templates by conjoining the available basic features. This is a limited and expensive way to generate features and is recognized as a modeling bottleneck. In this work, we propose an automatic method to generate feature templates for structured learning algorithms. We call this method entropy-guided, since it is based on the conditional entropy of local output variables given some basic features. We evaluate our method on four computational linguistics tasks and compare it with two important alternative feature generation methods, namely manual template generation and polynomial kernel functions. Our results show that entropy-guided feature generation outperforms both alternatives and offers additional advantages: the proposed method is cheaper than manual templates and much faster than kernel methods. Furthermore, the resulting systems achieve performance comparable to the state of the art and, on Portuguese dependency parsing, reduce the previous smallest error by more than 15%. We further propose to model two complex natural language processing problems that, as far as we know, have never been approached with structured learning methods before: quotation extraction and coreference resolution.
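The core quantity behind the entropy-guided criterion is the conditional entropy H(Y | X) of a local output variable Y given a basic feature X. As a minimal illustrative sketch (not the authors' implementation; the suffix/POS toy data below is invented for illustration), it can be estimated from empirical co-occurrence counts:

```python
from collections import Counter
from math import log2

def conditional_entropy(pairs):
    """Estimate H(Y | X) from (x, y) samples via empirical counts.

    H(Y | X) = -sum_{x,y} p(x, y) * log2 p(y | x)
    """
    joint = Counter(pairs)                      # counts of (x, y) pairs
    x_marginal = Counter(x for x, _ in pairs)   # counts of x alone
    n = len(pairs)
    h = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n                 # empirical joint probability
        p_y_given_x = c / x_marginal[x]  # empirical conditional probability
        h -= p_xy * log2(p_y_given_x)
    return h

# Hypothetical toy data: POS tag (y) given word suffix (x).
samples = [("ing", "VERB"), ("ing", "VERB"), ("ly", "ADV"),
           ("ly", "ADV"), ("ing", "NOUN"), ("ed", "VERB")]
# A basic feature with low conditional entropy is highly discriminative,
# making it a good candidate for building a feature template.
print(conditional_entropy(samples))  # ≈ 0.459 bits
```

Features (or feature conjunctions) whose conditional entropy is low leave little uncertainty about the local output variable, which is the intuition for selecting them as templates.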