SciELO - Scientific Electronic Library Online

 


Computación y Sistemas

On-line version ISSN 2007-9737
Print version ISSN 1405-5546

Abstract

MULKI, Hala; HADDAD, Hatem; ALI, Chedi Bechikh and BABAOĞLU, Ismail. Tunisian Dialect Sentiment Analysis: A Natural Language Processing-based Approach. Comp. y Sist. [online]. 2018, vol.22, n.4, pp.1223-1232. Epub Feb 10, 2021. ISSN 2007-9737. https://doi.org/10.13053/cys-22-4-3009.

Social media platforms have witnessed a significant increase in posts written in the Tunisian dialect since the uprising in Tunisia at the end of 2010. Most of the posted tweets and comments reflect the impressions of the Tunisian public towards major social, economic and political events. These opinions have been tracked, analyzed and evaluated through sentiment analysis systems. In the current study, we investigate the impact of several preprocessing techniques on sentiment analysis using two sentiment classification models: supervised and lexicon-based. These models were trained on three Tunisian datasets of different sizes and multiple domains. Our results emphasize the positive impact of the preprocessing phase on the evaluation measures of both sentiment classifiers, as the baseline was significantly outperformed when stemming, emoji recognition and negation detection tasks were applied. Moreover, integrating named entities with these tasks enhanced the lexicon-based classification performance on all datasets and that of the supervised model on the medium- and small-sized datasets.

Keywords: Tunisian sentiment analysis; text preprocessing; named entities.
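
As an illustration of how two of the preprocessing tasks named in the abstract (emoji recognition and negation detection) might feed a lexicon-based classifier, a minimal Python sketch follows. It is not the authors' implementation. The lexicon, emoji polarities, negator list and every function name are hypothetical placeholders, English words stand in for Tunisian dialect terms, and stemming and named-entity handling are left out.

```python
# Hypothetical toy resources: none of these entries come from the paper.
LEXICON = {"good": 1, "great": 1, "bad": -1, "terrible": -1}
EMOJI_POLARITY = {"😊": 1, "😡": -1}
NEGATORS = {"not", "no", "never"}
PUNCTUATION = ".,!?;:"


def tokenize(text):
    """Lowercase the text, split on whitespace, and isolate known emojis."""
    # Emoji recognition: pad each known emoji with spaces so it becomes
    # its own token even when glued to a word.
    for emoji in EMOJI_POLARITY:
        text = text.replace(emoji, f" {emoji} ")
    tokens = (raw.strip(PUNCTUATION) for raw in text.lower().split())
    return [token for token in tokens if token]


def score(text):
    """Sum token polarities, flipping the first sentiment token after a negator."""
    total = 0
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True
            continue
        polarity = EMOJI_POLARITY.get(token, LEXICON.get(token, 0))
        if polarity:
            total += -polarity if negate else polarity
            negate = False  # the negation scope ends at the first sentiment token
    return total


def classify(text):
    """Map the summed score to a coarse sentiment label."""
    total = score(text)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

In this sketch a negator flips only the next sentiment-bearing token, which is one simple design choice among several; a real system for Tunisian dialect would need dialect-specific negation markers and a much larger lexicon.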
