<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462017000200167</article-id>
<article-id pub-id-type="doi">10.13053/cys-21-2-2732</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Author Verification Using a Semantic Space Model]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Hernández-Castañeda]]></surname>
<given-names><![CDATA[Ángel]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
<xref ref-type="aff" rid="Aaf"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Calvo]]></surname>
<given-names><![CDATA[Hiram]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Instituto Politécnico Nacional (IPN)]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Mexico</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Tecnológico Nacional de México]]></institution>
<addr-line><![CDATA[Estado de México]]></addr-line>
<country>Mexico</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>06</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>06</month>
<year>2017</year>
</pub-date>
<volume>21</volume>
<numero>2</numero>
<fpage>167</fpage>
<lpage>179</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462017000200167&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462017000200167&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462017000200167&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: We propose to solve the author verification problem with a semantic space model built using Latent Dirichlet Allocation (LDA). We experiment with the corpora used in the author identification tasks at PAN 2014 and PAN 2015. These datasets contain subsets in four languages: English, Spanish, Dutch, and Greek. Each problem in these corpora consists of one to five known documents written by a single author and one unknown document. The task is to predict whether the unknown document was written by the author of the known documents. We processed the documents in each dataset and captured each author's fingerprint by generating a probability distribution of words in the documents. On the PAN 2015 classification task, we achieved accuracies of 81.6%, 75.4%, 74.1%, and 67.1% on the English, Spanish, Dutch, and Greek subsets, respectively. In particular, for the English subset, we surpassed the best result reported in both competitions.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Author verification]]></kwd>
<kwd lng="en"><![CDATA[semantic space model]]></kwd>
<kwd lng="en"><![CDATA[cross-genre]]></kwd>
<kwd lng="en"><![CDATA[cross-topic]]></kwd>
<kwd lng="en"><![CDATA[latent Dirichlet allocation]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Afroz]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Brennan]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Greenstadt]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Detecting hoaxes, frauds, and deception in writing style online]]></article-title>
<source><![CDATA[IEEE Symposium on Security and Privacy]]></source>
<year>2012</year>
<page-range>461&#8211;475</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bergsma]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Post]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Yarowsky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Stylometric analysis of scientific articles]]></source>
<year>2012</year>
<conf-name><![CDATA[Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Blei]]></surname>
<given-names><![CDATA[D.M.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[A.Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Jordan]]></surname>
<given-names><![CDATA[M.I.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Latent Dirichlet allocation]]></article-title>
<source><![CDATA[Journal of Machine Learning Research]]></source>
<year>2003</year>
<volume>3</volume>
<page-range>993&#8211;1022</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bradley]]></surname>
<given-names><![CDATA[J.K.]]></given-names>
</name>
<name>
<surname><![CDATA[Kelley]]></surname>
<given-names><![CDATA[P.G.]]></given-names>
</name>
<name>
<surname><![CDATA[Roth]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Author identification from citations]]></source>
<year>2008</year>
<publisher-loc><![CDATA[Pittsburgh, PA, USA]]></publisher-loc>
<publisher-name><![CDATA[Dept. Computing Science, Carnegie Mellon Univ.]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Castro]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Lindauer]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Author Identification on Twitter]]></source>
<year>2012</year>
</nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dumais]]></surname>
<given-names><![CDATA[S.T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Latent semantic analysis]]></article-title>
<source><![CDATA[Annual Review of Information Science and Technology]]></source>
<year>2004</year>
<volume>38</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>188&#8211;230</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fawcett]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An introduction to ROC analysis]]></article-title>
<source><![CDATA[Pattern Recognition Letters]]></source>
<year>2006</year>
<volume>27</volume>
<numero>8</numero>
<issue>8</issue>
<page-range>861&#8211;874</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Green]]></surname>
<given-names><![CDATA[R.M.]]></given-names>
</name>
<name>
<surname><![CDATA[Sheppard]]></surname>
<given-names><![CDATA[J.W.]]></given-names>
</name>
</person-group>
<source><![CDATA[Comparing Frequency- and Style-Based Features for Twitter Author Identification]]></source>
<year>2013</year>
<conf-name><![CDATA[The Twenty-Sixth International FLAIRS Conference]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Layton]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Watters]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Dazeley]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Local n-grams for Author Identification]]></article-title>
<source><![CDATA[Notebook for PAN at CLEF]]></source>
<year>2013</year>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Madigan]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Genkin]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Lewis]]></surname>
<given-names><![CDATA[D.D.]]></given-names>
</name>
<name>
<surname><![CDATA[Argamon]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Fradkin]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Ye]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Author identification on the large scale]]></article-title>
<source><![CDATA[Proceedings of the Meeting of the Classification Society of North America]]></source>
<year>2005</year>
<page-range>13</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Moreau]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Jayapal]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Lynch]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Vogel]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Author verification: basic stacked generalization applied to predictions from a set of heterogeneous learners]]></article-title>
<source><![CDATA[Working Notes Papers of the CLEF]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Narayanan]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Paskov]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Gong]]></surname>
<given-names><![CDATA[N.Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Bethencourt]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Stefanov]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Shin]]></surname>
<given-names><![CDATA[E.C.R.]]></given-names>
</name>
<name>
<surname><![CDATA[Song]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[On the feasibility of internet-scale author identification]]></source>
<year>2012</year>
<conf-name><![CDATA[IEEE Symposium on Security and Privacy]]></conf-name>
<conf-loc> </conf-loc>
<page-range>300&#8211;314</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Nirkhi]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Dharaskar]]></surname>
<given-names><![CDATA[R.V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Comparative study of authorship identification techniques for cyber forensics analysis]]></article-title>
<source><![CDATA[ArXiv]]></source>
<year>2013</year>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pacheco]]></surname>
<given-names><![CDATA[M.L.]]></given-names>
</name>
<name>
<surname><![CDATA[Fernández]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Porco]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Random Forest with Increased Generalization: A Universal Background Approach for Authorship Verification]]></article-title>
<source><![CDATA[CLEF Working Notes]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pateriya]]></surname>
<given-names><![CDATA[P.K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Study on Author Identification through Stylometry]]></article-title>
<source><![CDATA[International Journal of Computer Science &amp; Communication Networks]]></source>
<year>2012</year>
<volume>2</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>653&#8211;657</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pavelec]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Justino]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Oliveira]]></surname>
<given-names><![CDATA[L.S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Author identification using stylometric features]]></article-title>
<source><![CDATA[Revista Iberoamericana de Inteligencia Artificial]]></source>
<year>2007</year>
<volume>11</volume>
<numero>36</numero>
<issue>36</issue>
<page-range>59&#8211;66</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Peñas]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Rodrigo]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[A simple measure to assess non-response]]></source>
<year>2011</year>
<conf-name><![CDATA[Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pimas]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Kröll]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Kern]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Know-Center at PAN author identification]]></article-title>
<source><![CDATA[Working Notes Papers of the CLEF]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Quatieri]]></surname>
<given-names><![CDATA[T.F.]]></given-names>
</name>
<name>
<surname><![CDATA[Dunn]]></surname>
<given-names><![CDATA[R.B.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Speaker verification using adapted Gaussian mixture models]]></article-title>
<source><![CDATA[Digital Signal Processing]]></source>
<year>2000</year>
<volume>10</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>19&#8211;41</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Savoy]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Authorship attribution based on a probabilistic topic model]]></article-title>
<source><![CDATA[Information Processing &amp; Management]]></source>
<year>2013</year>
<volume>49</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>341&#8211;354</page-range></nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Seroussi]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Zukerman]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Bohnert]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Authorship attribution with topic models]]></article-title>
<source><![CDATA[Computational Linguistics]]></source>
<year>2014</year>
<volume>40</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>269&#8211;310</page-range></nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stamatatos]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A survey of modern authorship attribution methods]]></article-title>
<source><![CDATA[Journal of the American Society for Information Science and Technology]]></source>
<year>2009</year>
<volume>60</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>538&#8211;556</page-range></nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stamatatos]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Daelemans]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Verhoeven]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Juola]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[López-López]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Potthast]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Stein]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Overview of the Author Identification Task at PAN]]></article-title>
<source><![CDATA[CLEF Working Notes]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stamatatos]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Daelemans]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Verhoeven]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Juola]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[López-López]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Potthast]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Stein]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Overview of the Author Identification Task at PAN]]></article-title>
<source><![CDATA[CLEF Working Notes]]></source>
<year>2014</year>
<page-range>877&#8211;897</page-range></nlm-citation>
</ref>
<ref id="B25">
<label>25</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Verhoeven]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Daelemans]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[CLiPS Stylometry Investigation (CSI) corpus: A Dutch corpus for the detection of age, gender, personality, sentiment and deception in text]]></article-title>
<source><![CDATA[LREC]]></source>
<year>2014</year>
<page-range>3081&#8211;3085</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
