Assessing Vietnamese Text Readability using Multi-Level Linguistic Features

An-Vinh Luong, Diep Nguyen, Dien Dinh, Thuy Bui
2020 International Journal of Advanced Computer Science and Applications  
Text readability is the problem of determining whether a text is suitable for a certain group of readers, so a model that assesses the readability of text is of great significance across science, publishing, and education. While text readability has attracted attention since the late nineteenth century for English and other popular languages, it remains relatively underexplored in Vietnamese. Previous studies on this topic in Vietnamese have focused only on the examination of shallow word-level features using surface statistics such as frequency and ratio. Hence, features at higher levels, such as sentence structure and meaning, remain untapped. In this study, we propose the most comprehensive analysis of Vietnamese text readability to date, targeting features at all linguistic levels, ranging from lexical and phrasal elements to syntactic and semantic factors. This work pioneers the investigation of the effects of multi-level linguistic features on text readability in the Vietnamese language.
doi:10.14569/ijacsa.2020.0110814 fatcat:njgnqh73ijby7atmioy4fc5m4a