Exploring rich evidence for maximum entropy-based question answering [article]

Dan Shen, Universität des Saarlandes
2008
Open domain automated Question Answering (QA) aims to automatically answer users' questions in spoken language. I propose a maximum entropy-based ranking model which effectively integrates various features, including orthographic, lexical, surface pattern, syntactic and semantic features, for answer extraction. To effectively capture syntactic evidence, I present two methods: a dependency relation pattern method and a dependency relation path correlation method. Both methods overcome the [...] arising from the divergences of lexical representations between question and answer sentences. I experimentally demonstrate that both methods greatly outperform the state-of-the-art syntactic answer extraction methods on TREC datasets. To capture semantic evidence, I propose an automatic method to incorporate FrameNet-style semantic role information. Its graph-theoretic framework goes some way towards addressing the coverage problems of FrameNet and formulates the similarity measure of semantic structures as a graph matching problem. Experimental results show that the FrameNet-based semantic features may further boost the performance of the answer extraction module. Furthermore, I propose a maximum entropy-based ranking model to incorporate all captured information. As a result, the model using the optimal feature combination achieves top-ranked performance among all participants worldwide in the most recent TREC evaluation.
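To make the idea of a maximum entropy-based ranking model concrete, here is a minimal sketch of how such a model could score answer candidates: each candidate gets a weighted sum of its feature values, and the scores are normalized with a softmax over all candidates for the same question. The function name `maxent_rank`, the feature names, the weights and the candidate values are all hypothetical illustrations, not the features or parameters of the thesis, and training the weights is not shown.

```python
import math


def maxent_rank(candidates, weights):
    """Rank answer candidates with a maximum entropy (log-linear) model.

    candidates: dict mapping candidate string -> dict of feature name -> value
    weights:    dict mapping feature name -> weight (assumed already trained)
    Returns a list of (candidate, probability) pairs, best first.
    """
    # Linear score w . f(question, candidate) for every candidate.
    scores = {
        cand: sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for cand, feats in candidates.items()
    }
    # Softmax over the candidate set; subtract the max for numerical stability.
    top = max(scores.values())
    exp_scores = {cand: math.exp(s - top) for cand, s in scores.items()}
    z = sum(exp_scores.values())
    ranked = [(cand, e / z) for cand, e in exp_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights for the five feature families named in the abstract.
weights = {
    "orthographic": 0.5,
    "lexical": 1.0,
    "surface_pattern": 0.8,
    "syntactic": 1.5,
    "semantic": 1.2,
}

# Hypothetical feature values for two candidates to one question.
candidates = {
    "1969": {"orthographic": 1.0, "lexical": 0.6, "syntactic": 0.9, "semantic": 0.7},
    "Houston": {"orthographic": 0.0, "lexical": 0.6, "syntactic": 0.2, "semantic": 0.1},
}

for cand, prob in maxent_rank(candidates, weights):
    print(f"{cand}\t{prob:.3f}")
```

In this sketch the syntactic and semantic evidence enter only as numeric feature values; how those values would be computed from dependency relations or FrameNet-style roles is outside its scope.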
doi:10.22028/d291-22541 fatcat:gyec2e3yqfdlro3hwjoetmqok4