A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
The file type is
Kelantan and Sarawak Malay Dialects: Parallel Dialect Text Collection and Alignment Using Hybrid Distance-Statistical-Based Phrase Alignment Algorithm
Turkish Journal of Computer and Mathematics Education
Parallel texts corpora are essential resources especially in translation and multilingual information retrieval. However, the publicly available parallel text corpora are limited to certain types and domains. Besides, Malay dialects are not standardized in term of writing. The existing alignment algorithms that is used to analayze the writing will require a large training data to obtain a good result. The paper describes our methodology in acquiring a parallel text corpus of Standard Malay anddoi:10.17762/turcomat.v12i3.1160 fatcat:dkhykvpwyjfm3ker6z6emmdlaq