Knowledge Distillation of Japanese Morphological Analyzer
日本語形態素解析器の知識蒸留

Sora TAGAMI, Daisuke BEKKI
2021
In this study, we apply knowledge distillation to the Japanese morphological analyzer rakkyo and evaluate whether the method compresses its model size and whether training converges on smaller datasets. Recently, Japanese morphological analyzers have achieved high performance in both accuracy and speed. From the viewpoint of practical use, however, it is preferable to reduce the model size. The rakkyo model, among others, succeeded in significantly reducing its model size by using only character unigrams and discarding the dictionary, training instead on silver data of 500 million sentences generated by Juman++. We tried to further compress rakkyo by constructing a neural morphological analyzer for Japanese that uses the outputs of rakkyo, namely its probability distributions, as training data. Evaluation against silver data generated by rakkyo suggests that our model approaches the accuracy of rakkyo with a smaller amount of training data.
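The distillation setup described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the teacher (rakkyo) emits a per-character probability distribution over morphological tags, and that the student is a small character-unigram tagger trained to match those soft targets via KL divergence. All names, layer sizes, and the tag-set size are hypothetical.

```python
# Minimal sketch of knowledge distillation for a character-level tagger.
# Hypothetical sizes and names; the teacher distributions would in practice
# come from running rakkyo over raw (silver) text.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentTagger(nn.Module):
    """Small character-level tagger; architecture is illustrative only."""
    def __init__(self, vocab_size, emb_dim=64, hidden=128, num_tags=50):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, char_ids):
        h, _ = self.rnn(self.emb(char_ids))
        return self.out(h)  # (batch, seq_len, num_tags) logits

def distillation_loss(student_logits, teacher_probs):
    """KL divergence between teacher soft targets and student predictions."""
    log_p_student = F.log_softmax(student_logits, dim=-1)
    return F.kl_div(log_p_student, teacher_probs, reduction="batchmean")

# Dummy usage: in reality teacher_probs are rakkyo's per-character outputs.
student = StudentTagger(vocab_size=8000)
char_ids = torch.randint(0, 8000, (2, 16))                      # batch of character ids
teacher_probs = torch.softmax(torch.randn(2, 16, 50), dim=-1)   # placeholder teacher output
loss = distillation_loss(student(char_ids), teacher_probs)
loss.backward()
```

Training on the teacher's full distributions rather than hard labels is what lets the student learn from unannotated text at scale, which is why convergence on smaller datasets is the quantity evaluated.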
doi:10.11517/pjsai.jsai2021.0_4j1gs6d02