A stochastic language model using dependency and its improvement by word clustering

Shinsuke Mori, Makoto Nagao
1998 Proceedings of the 36th annual meeting on Association for Computational Linguistics
In this paper, we present a stochastic language model for Japanese using dependency. The prediction unit in this model is an attribute of "bunsetsu". This is represented by the product of the head of content words and that of function words. The relation between the attributes of "bunsetsu" is ruled by a context-free grammar. The word sequences are predicted from the attribute using a word n-gram model. The spelling of an unknown word is predicted using a character n-gram model. This model is robust in that it can compute the probability of an arbitrary string and is complete in that it models from unknown word to dependency at the same time.
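As an illustration of the unknown-word component only, the following is a minimal sketch of a character n-gram model that assigns a non-zero probability to any spelling. It is not the authors' implementation: the bigram order, the add-one smoothing, the single reserved slot for unseen characters, and the names `train_char_bigram` and `spelling_logprob` are all assumptions made for this example.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # hypothetical word-boundary symbols


def train_char_bigram(words):
    """Count character bigrams, padding each word with boundary symbols."""
    bigrams, contexts, vocab = Counter(), Counter(), {EOS}
    for w in words:
        chars = [BOS] + list(w) + [EOS]
        vocab.update(chars[1:])
        for prev, cur in zip(chars, chars[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def spelling_logprob(word, model):
    """Log-probability of a spelling under add-one smoothing (an assumption)."""
    bigrams, contexts, vocab = model
    v = len(vocab) + 1  # one extra slot shared by all unseen characters
    chars = [BOS] + list(word) + [EOS]
    return sum(
        math.log((bigrams[(p, c)] + 1) / (contexts[p] + v))
        for p, c in zip(chars, chars[1:])
    )


model = train_char_bigram(["abc", "abd", "bcd"])
print(spelling_logprob("abc", model))  # spelling seen in training
print(spelling_logprob("zzz", model))  # unseen characters, still finite
```

Because every character transition keeps some probability mass, the score stays finite for strings made entirely of unseen characters, which is the sense of robustness the abstract describes.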
doi:10.3115/980691.980717 dblp:conf/acl/MoriN98 fatcat:hp7phmuymfez3oj5kgijdy7y3u