Data Fusion for Japanese Term and Character N-gram Search
2015
Proceedings of the 20th Australasian Document Computing Symposium - ADCS '15
Term segmentation plays a vital role in building effective information retrieval systems. In particular, languages such as Japanese and Chinese require a morphological analyzer or a word segmenter to identify potential terms. The alternative approach to indexing a segmented collection is n-gram search, where every n-length sequence of symbols is indexed. Both approaches have strengths and weaknesses when applied to non-English collections. In this study, we explore data fusion techniques to
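As a rough illustration of the n-gram indexing described in the abstract, the following minimal sketch extracts every overlapping n-length character sequence from a string and builds a tiny inverted index over it. The function names `char_ngrams` and `build_index`, the two sample documents, and the choice of bigrams (n = 2) are assumptions made for this example; they are not taken from the paper, and no morphological analyzer or data fusion step is shown.

```python
from collections import defaultdict


def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return every overlapping n-length character sequence in text."""
    # A text shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(docs: dict[str, str], n: int = 2) -> dict[str, set[str]]:
    """Map each character n-gram to the ids of the documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index


# Hypothetical two-document collection, used only for illustration.
docs = {"d1": "情報検索", "d2": "検索エンジン"}
index = build_index(docs)

print(char_ngrams("情報検索"))   # ['情報', '報検', '検索']
print(sorted(index["検索"]))     # ['d1', 'd2']
```

Because the n-grams are taken at every character offset, no word boundaries need to be known in advance, which is the property that makes this approach an alternative to indexing a segmented collection.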
doi:10.1145/2838931.2838939
dblp:conf/adcs/YasukawaCS15
fatcat:a3bdwzns2rh45brzojnzvoiysu