SciELO - Scientific Electronic Library Online


Computación y Sistemas

Print version ISSN 1405-5546


ISLAM, Zahurul and MEHLER, Alexander. Automatic Readability Classification of Crowd-Sourced Data based on Linguistic and Information-Theoretic Features. Comp. y Sist. [online]. 2013, vol.17, n.2, pp.113-123. ISSN 1405-5546.

This paper presents a classifier of text readability based on information-theoretic features. The classifier was developed based on a linguistic approach to readability that explores lexical, syntactic and semantic features. For this evaluation we extracted a corpus of 645 articles from Wikipedia together with their quality judgments. We show that information-theoretic features perform as well as their linguistic counterparts even if we explore several linguistic levels at once.

Keywords: Text readability; Wikipedia; entropy; information transmission; evaluation of features.



All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.