Evaluation and classification of syntax usage in determining short-text semantic similarity

Vuk Batanovic, Dragan Bojic
2014 Telfor Journal  
This paper outlines and categorizes ways of using syntactic information in a number of algorithms for determining the semantic similarity of short texts. We consider the use of word order information, part-of-speech tagging, parsing and semantic role labeling. We analyze and evaluate the effects of syntax usage on algorithm performance by utilizing the results of a paraphrase detection test on the Microsoft Research Paraphrase Corpus. We also propose a new classification of algorithms based on their applicability to languages with scarce natural language processing tools.
Keywords: natural language processing, MSRPC, parsing, part-of-speech tagging, semantic role labeling, short-text semantic similarity, syntax, word order.
doi:10.5937/telfor1401064b fatcat:wb7hds5lrve2lixne3c2pxuxgq