Unsupervised induction of modern standard Arabic verb classes using syntactic frames and LSA

Neal Snider, Mona Diab
2006 Proceedings of the COLING/ACL on Main conference poster sessions - unpublished
We exploit the resources in the Arabic Treebank (ATB) and Arabic Gigaword (AG) to determine the best features for the novel task of automatically creating lexical semantic verb classes for Modern Standard Arabic (MSA). The verbs are classified into groups that share semantic elements of meaning as they exhibit similar syntactic behavior. The results of the clustering experiments are compared with a gold standard set of classes, which is approximated by using the noisy English translations
... ed in the ATB to create Levin-like classes for MSA. The quality of the clusters is found to be sensitive to the inclusion of syntactic frames, LSA vectors, morphological pattern, and subject animacy. The best set of parameters yields an F_{β=1} score of 0.456, compared with a random-baseline F_{β=1} score of 0.205.
doi:10.3115/1273073.1273175 fatcat:oovu6qg62nbllc3lsyyg5iwiou