SciELO - Scientific Electronic Library Online

 



Computación y Sistemas

On-line version ISSN 2007-9737; Print version ISSN 1405-5546

Abstract

SHUSHKEVICH, Elena; CARDIFF, John; ROSSO, Paolo and AKHTYAMOVA, Liliya. Offensive Language Recognition in Social Media. Comp. y Sist. [online]. 2020, vol.24, n.2, pp.523-532. Epub 04-Oct-2021. ISSN 2007-9737. https://doi.org/10.13053/cys-24-2-3376.

This article proposes an approach to multiclass classification for aggressive language recognition on Twitter. During preprocessing, the existing dataset is extended with external data obtained from the links that appear in it. This expanded the training dataset and thereby improved classification quality. The resulting model is an ensemble of classical machine learning models: Logistic Regression, Support Vector Machines, Naive Bayes, and a combination of Logistic Regression and Naive Bayes. In one of the experiments the macro F1-score reached 0.61, which exceeds the published state-of-the-art value by 1 percentage point. This indicates the potential value of the proposed approach for hate speech recognition in social media.
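
As an illustration of the two technical ingredients named in the abstract, the sketch below combines the label predictions of several models and scores the result with macro F1. The abstract does not say how the ensemble merges its members, so hard majority voting is an assumption here, and the function names `majority_vote` and `macro_f1`, the tie-breaking rule, and the toy labels are all hypothetical. Macro F1 itself is the standard unweighted mean of per-class F1 scores.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists by hard majority vote (assumed scheme).

    Ties go to the label predicted by the earliest model in the list.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy predictions from three hypothetical models over four tweets.
    model_outputs = [
        ["a", "b", "c", "a"],
        ["a", "c", "a", "a"],
        ["b", "b", "a", "a"],
    ]
    gold = ["a", "b", "c", "a"]
    ensemble = majority_vote(model_outputs)
    print(ensemble, macro_f1(gold, ensemble))
```

Because macro F1 weights every class equally, a class the ensemble never predicts correctly pulls the average down as much as a frequent class would, which is why it is a common choice for imbalanced offensive-language datasets.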

Keywords: Hate speech; ensemble of models; logistic regression; support vector machine; naive Bayes.
