<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462020000200523</article-id>
<article-id pub-id-type="doi">10.13053/cys-24-2-3376</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Offensive Language Recognition in Social Media]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Shushkevich]]></surname>
<given-names><![CDATA[Elena]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Cardiff]]></surname>
<given-names><![CDATA[John]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Rosso]]></surname>
<given-names><![CDATA[Paolo]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Akhtyamova]]></surname>
<given-names><![CDATA[Liliya]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Technological University Dublin]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Ireland</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Universitat Politècnica de València]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Spain</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>06</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>06</month>
<year>2020</year>
</pub-date>
<volume>24</volume>
<numero>2</numero>
<fpage>523</fpage>
<lpage>532</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462020000200523&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462020000200523&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462020000200523&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: This article proposes an approach to multiclass classification for aggressive language recognition on Twitter. During preprocessing, external data derived from the links contained in the dataset is added to the existing dataset. This expands the training dataset and thereby improves the quality of the classification. The model is an ensemble of classical machine learning models: Logistic Regression, Support Vector Machines, Naive Bayes, and a combination of Logistic Regression and Naive Bayes. In one of the experiments the macro F1-score reached 0.61, which exceeds the published state-of-the-art value by 1 percentage point. This indicates the potential value of the proposed approach for hate speech recognition in social media.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Hate speech]]></kwd>
<kwd lng="en"><![CDATA[ensemble of models]]></kwd>
<kwd lng="en"><![CDATA[logistic regression]]></kwd>
<kwd lng="en"><![CDATA[support vector machine]]></kwd>
<kwd lng="en"><![CDATA[naive Bayes]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Clarke]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Grieve]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Dimensions of abusive language on Twitter]]></source>
<year>2017</year>
<conf-name><![CDATA[First Workshop on Abusive Language Online, Association for Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1-10</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fasoli]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Carnaghi]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Paladino]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Social acceptability of sexist derogatory and sexist objectifying slurs across contexts]]></article-title>
<source><![CDATA[Language Sciences]]></source>
<year>2015</year>
<volume>52</volume>
<page-range>98-107</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fersini]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Anzovino]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Rosso]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<source><![CDATA[Overview of the task on automatic misogyny identification at Ibereval]]></source>
<year>2018</year>
<conf-name><![CDATA[Third Workshop on Evaluation of Human Language Technologies for Iberian Languages]]></conf-name>
<conf-loc> </conf-loc>
<page-range>214-28</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fersini]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Nozza]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Rosso]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<source><![CDATA[Overview of the Evalita 2018 task on automatic misogyny identification (AMI)]]></source>
<year>2018</year>
<numero>4497</numero>
<conf-name><![CDATA[6th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian]]></conf-name>
<conf-loc> </conf-loc>
<issue>4497</issue>
<page-range>59-66</page-range><publisher-name><![CDATA[Academia]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Frenda]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Ghanem]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Montes-y-Gómez]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Exploration of misogyny in Spanish and English tweets]]></source>
<year>2018</year>
<volume>2150</volume>
<conf-name><![CDATA[CEUR Workshop Proceedings]]></conf-name>
<conf-loc> </conf-loc>
<page-range>260-7</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Genkin]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Lewis]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Madigan]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Large-scale Bayesian logistic regression for text categorization]]></article-title>
<source><![CDATA[Technometrics]]></source>
<year>2007</year>
<volume>49</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>291-304</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Joachims]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Learning to classify text using support vector machines: Methods, theory and algorithms]]></source>
<year>2002</year>
<publisher-name><![CDATA[Kluwer Academic Publishers]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Park]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Fung]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<source><![CDATA[One-step and two-step classification for abusive language detection on Twitter]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Shushkevich]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Cardiff]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Classifying misogynistic tweets using a blended model: the AMI shared task in Ibereval 2018]]></source>
<year>2018</year>
<conf-name><![CDATA[CEUR Workshop Proceedings]]></conf-name>
<conf-loc> </conf-loc>
<page-range>255-9</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Manning]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Baselines and bigrams: simple, good sentiment and topic classification]]></source>
<year>2012</year>
<volume>2</volume>
<conf-name><![CDATA[50th Annual Meeting of the Association for Computational Linguistics: Short Papers]]></conf-name>
<conf-loc> </conf-loc>
<page-range>90-4</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Waseem]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Hovy]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Hateful symbols or hateful people? Predictive features for hate speech detection on twitter]]></article-title>
<source><![CDATA[Proceedings of the NAACL Student Research Workshop]]></source>
<year>2016</year>
<page-range>88-93</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wright]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Logistic regression]]></source>
<year>1995</year>
<publisher-name><![CDATA[L.C. Grimm]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Naive Bayes text classifier granular computing]]></source>
<year>2007</year>
<conf-name><![CDATA[GRC'07 IEEE International Conference]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Luo]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Hate speech detection: A solved problem? The challenging case of long tail on twitter]]></article-title>
<source><![CDATA[Semantic Web]]></source>
<year>2018</year>
<volume>10</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>925-45</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
