<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462022000301293</article-id>
<article-id pub-id-type="doi">10.13053/cys-26-3-4350</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[What&#8217;s Your Style? Automatic Genre Identification with Neural Network]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Dömötör]]></surname>
<given-names><![CDATA[Andrea]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
<xref ref-type="aff" rid="Aaf"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Kákonyi]]></surname>
<given-names><![CDATA[Tibor]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[Zijian Gy&#337;z&#337;]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
<xref ref-type="aff" rid="Aaf"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[MTA-PPKE Hungarian Language Technology Research Group]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Hungary</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Pázmány Péter Catholic University, Faculty of Information Technology and Bionics]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Hungary</country>
</aff>
<aff id="Af3">
<institution><![CDATA[Pázmány Péter Catholic University, Faculty of Humanities and Social Studies]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Hungary</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>09</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>09</month>
<year>2022</year>
</pub-date>
<volume>26</volume>
<numero>3</numero>
<fpage>1293</fpage>
<lpage>1299</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462022000301293&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462022000301293&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462022000301293&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Genre identification is an important task in natural language processing, useful for many practical and research purposes. The task is challenging because genre is not a homogeneous, unequivocal property of a text and is often hard to separate from topic. In this paper we compare the performance of two automatic genre identification methods on six text types: literary, academic, legal, press, spoken and personal. In the first part of our research we experimented with traditional machine learning methods using linguistic, n-gram and error features. In the second part we applied a word embedding based neural network to the same task, experimenting with different training data (words only, POS tags only, words and POS tags, etc.). Our results show that the neural network is a suitable method for this task, while traditional machine learning performed significantly worse. Our word embedding based method achieved high accuracy (around 70%). Results for the individual text categories appeared to depend on the stylistic properties of the genres studied.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Genre identification]]></kwd>
<kwd lng="en"><![CDATA[text classification]]></kwd>
<kwd lng="en"><![CDATA[machine learning]]></kwd>
<kwd lng="en"><![CDATA[neural networks]]></kwd>
<kwd lng="en"><![CDATA[word embedding]]></kwd>
<kwd lng="en"><![CDATA[stylistics]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bojanowski]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Grave]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Joulin]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Mikolov]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Enriching word vectors with subword information]]></source>
<year>2016</year>
<publisher-name><![CDATA[CoRR]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Clark]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Ruthven]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[O&#8217;Brian Holt]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The evolution of genre in Wikipedia]]></article-title>
<source><![CDATA[Journal for Language Technology and Computational Linguistics]]></source>
<year>2009</year>
<volume>24</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>1-22</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Endrédy]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Prószéky]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Pázmány korpusz]]></article-title>
<source><![CDATA[Nyelvtudományi Közlemények]]></source>
<year>2016</year>
<numero>112</numero>
<issue>112</issue>
<page-range>191-206</page-range></nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Joulin]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Grave]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Bojanowski]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Mikolov]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Bag of tricks for efficient text classification]]></source>
<year>2016</year>
<publisher-name><![CDATA[CoRR]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lustrek]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Overview of Automatic Genre Identification]]></source>
<year>2007</year>
<publisher-loc><![CDATA[Ljubljana, Slovenia]]></publisher-loc>
<publisher-name><![CDATA[Jo&#382;ef Stefan Institute, Department of Intelligent Systems]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[McCarthy]]></surname>
<given-names><![CDATA[P. M.]]></given-names>
</name>
<name>
<surname><![CDATA[Myers]]></surname>
<given-names><![CDATA[J. C.]]></given-names>
</name>
<name>
<surname><![CDATA[Briner]]></surname>
<given-names><![CDATA[S. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Graesser]]></surname>
<given-names><![CDATA[A. C.]]></given-names>
</name>
<name>
<surname><![CDATA[McNamara]]></surname>
<given-names><![CDATA[D. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A psychological and computational study of sub-sentential genre recognition]]></article-title>
<source><![CDATA[Journal for Language Technology and Computational Linguistics]]></source>
<year>2009</year>
<volume>24</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>23-56</page-range></nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Novák]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Novák]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Magyar szóbeágyazási modellek kézi kiértékelése]]></source>
<year>2018</year>
<conf-name><![CDATA[XIV. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2018)]]></conf-name>
<conf-loc>Szeged, Hungary</conf-loc>
</nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Oravecz]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Váradi]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Sass]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Calzolari]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Hungarian Gigaword Corpus]]></source>
<year>2014</year>
<conf-name><![CDATA[9th International Conference on Language Resources and Evaluation]]></conf-name>
<conf-loc>Reykjavik, Iceland</conf-loc>
</nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Petrenz]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Webber]]></surname>
<given-names><![CDATA[B. L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Stable classification of text genres]]></article-title>
<source><![CDATA[Computational Linguistics]]></source>
<year>2011</year>
<volume>37</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>385-393</page-range></nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Prószéky]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Tihanyi]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<source><![CDATA[Humor &#8211; a morphological system for corpus analysis]]></source>
<year>1996</year>
<conf-name><![CDATA[First TELRI Seminar in Tihany]]></conf-name>
<conf-loc>Budapest, Hungary</conf-loc>
</nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stolcke]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[SRILM &#8211; an extensible language modeling toolkit]]></source>
<year>2002</year>
<conf-name><![CDATA[7th International Conference on Spoken Language Processing]]></conf-name>
<conf-date>2002</conf-date>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[Z. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Laki]]></surname>
<given-names><![CDATA[L. J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Pirate: A task-oriented monolingual quality estimation system]]></article-title>
<source><![CDATA[Computational Linguistics and Applications]]></source>
<year>2017</year>
<volume>8</volume>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
