Computación y Sistemas
On-line version ISSN 2007-9737
Print version ISSN 1405-5546
Abstract
ISLAM, Zahurul and MEHLER, Alexander. Automatic Readability Classification of Crowd-Sourced Data based on Linguistic and Information-Theoretic Features. Comp. y Sist. [online]. 2013, vol.17, n.2, pp.113-123. ISSN 2007-9737.
This paper presents a classifier of text readability based on information-theoretic features. The classifier builds on a linguistic approach to readability that explores lexical, syntactic and semantic features. For the evaluation, we extracted a corpus of 645 articles from Wikipedia together with their quality judgments. We show that information-theoretic features perform as well as their linguistic counterparts, even when several linguistic levels are explored at once.
Keywords: Text readability; Wikipedia; entropy; information transmission; evaluation of features.
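As a rough illustration of what an entropy-based feature might look like, the sketch below computes the Shannon entropy of a text's word-frequency distribution. The abstract does not specify the paper's actual feature set, so this is only an assumption-laden example: the function name `word_entropy` is hypothetical, and lowercasing plus whitespace tokenization is a simplification chosen for brevity.

```python
import math
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the word-frequency distribution of `text`.

    Hypothetical helper for illustration; not the feature definition used
    in the paper. Tokenization is a plain lowercase whitespace split.
    """
    tokens = text.lower().split()
    if not tokens:
        # An empty text carries no information under this definition.
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum over word types of p(w) * log2 p(w)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A text that repeats one word has entropy 0, while a text of four distinct words used once each has entropy 2 bits. A value like this could be passed to a classifier alongside other features.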