A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf.
Ternary Decomposition and Dictionary Extension for Khmer Word Segmentation
2016
Journal of Information Technology Applications and Management
In this paper, we propose a dictionary extension and a ternary decomposition technique to improve the effectiveness of Khmer word segmentation. Most word segmentation approaches depend on a dictionary. However, the dictionary in use is not fully reliable and cannot cover every word of the Khmer language, which gives rise to unknown, or out-of-vocabulary, words. Our approach extends the original dictionary with new words to make it more reliable. In addition, we use ternary
doi:10.21219/jitam.2016.23.2.011
fatcat:e6zsanmbwzbbfexp5rl5df3t2y