A statistical model for scientific readability

Luo Si, Jamie Callan
2001 Proceedings of the tenth international conference on Information and knowledge management - CIKM'01  
This paper presents a new method of using statistical models to estimate the reading difficulty of Web pages. Language models are used to represent the content typically associated with different readability levels. Reading-level classifiers are created as linear combinations of a language model and surface linguistic features. Experiments show that this new method is more accurate than the widely used Flesch-Kincaid readability formula.
KEYWORDS: Readability, Flesch-Kincaid, Unigram Language Model, EM.
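
The classifier described in the abstract, a linear combination of a language-model score and surface linguistic features, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The add-one smoothing, the weight `lam`, the function names, and the precomputed `surface_scores` dictionary (a stand-in for whatever surface features are used) are all assumptions made for illustration, and the EM step named in the keywords is not shown.

```python
import math
from collections import Counter


def train_unigram(docs):
    """Count word frequencies over the training texts of one reading level."""
    counts = Counter()
    for text in docs:
        counts.update(text.lower().split())
    return counts


def log_prob(text, counts, vocab_size):
    """Log-likelihood of text under a unigram model with add-one smoothing
    (the smoothing choice is an assumption, not taken from the paper)."""
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size))
        for w in text.lower().split()
    )


def classify(text, models, surface_scores, lam=0.9):
    """Return the level maximizing lam * LM score + (1 - lam) * surface score.

    models: dict mapping level name -> Counter from train_unigram
    surface_scores: hypothetical dict mapping level name -> score derived
        from surface features of the text (computed elsewhere)
    lam: hypothetical interpolation weight
    """
    vocab = set()
    for counts in models.values():
        vocab.update(counts)
    vocab_size = len(vocab) + 1  # one extra slot reserves mass for unseen words

    def combined(level):
        lm = log_prob(text, models[level], vocab_size)
        return lam * lm + (1 - lam) * surface_scores[level]

    return max(models, key=combined)


models = {
    "easy": train_unigram(["the cat sat on the mat", "the dog ran"]),
    "hard": train_unigram(["mitochondrial enzymes catalyze oxidative phosphorylation"]),
}
neutral = {"easy": 0.0, "hard": 0.0}  # surface features switched off here
print(classify("the cat ran", models, neutral))
```

With the surface scores set to zero the decision rests on the unigram models alone; in practice the weight would be tuned on held-out data.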
doi:10.1145/502692.502695 fatcat:aofevpx7urdzfblxmickgyakga