Simple or Complex? Assessing the readability of Basque Texts

Itziar Gonzalez-Dios, María Jesús Aranzabe, Arantza Díaz de Ilarraza Sánchez, Haritz Salaberri
2014 International Conference on Computational Linguistics  
In this paper we present ErreXail, a readability assessment system for Basque that will serve as the preprocessing module of a Text Simplification system. To that end, we compile two corpora, one of simple texts and one of complex texts. To analyse those texts, we implement global, lexical, morphological, morpho-syntactic, syntactic and pragmatic features, based on those used for other languages and specifically tailored to Basque. We combine these feature types and train our classifiers. After ... ting the classifiers, we detect the features that perform best and the most predictive ones.
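The pipeline the abstract outlines (extract features from a simple and a complex corpus, then train a classifier) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three surface features, the nearest-centroid classifier, the function names and the English toy sentences are hypothetical stand-ins, not ErreXail's actual feature set, classifiers or corpora.

```python
import re
from statistics import mean


def extract_features(text):
    """Return a few surface features (hypothetical, not ErreXail's features)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    return {
        "words_per_sentence": len(words) / len(sentences),
        "chars_per_word": mean(len(w) for w in words),
        "type_token_ratio": len(set(words)) / len(words),
    }


def centroid(texts):
    """Average each feature over all texts of one class."""
    rows = [extract_features(t) for t in texts]
    return {name: mean(row[name] for row in rows) for name in rows[0]}


def classify(text, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    feats = extract_features(text)

    def distance(center):
        return sum((feats[name] - center[name]) ** 2 for name in feats)

    return min(centroids, key=lambda label: distance(centroids[label]))


# Toy English placeholder corpora; the paper uses Basque texts.
simple_texts = [
    "The cat sat. The dog ran.",
    "I like milk. You like tea.",
]
complex_texts = [
    "Notwithstanding considerable methodological disagreements, researchers "
    "systematically investigated multilingual readability assessment frameworks.",
    "Automatically simplifying syntactically complicated documents requires "
    "extensive preprocessing, including classification according to linguistic "
    "complexity.",
]

centroids = {
    "simple": centroid(simple_texts),
    "complex": centroid(complex_texts),
}

print(classify("A bird sang. It was nice.", centroids))
```

The features are left unscaled here, so the sentence-length feature dominates the distance; a real system would normalise the features and would also draw on the morphological, syntactic and pragmatic information the abstract mentions.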
dblp:conf/coling/Gonzalez-DiosASS14 fatcat:dnyxrlk5zncg3hmv7e6s4ee3ne