<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462019000301089</article-id>
<article-id pub-id-type="doi">10.13053/cys-23-3-3281</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Cross-Domain Failures of Fake News Detection]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Janicka]]></surname>
<given-names><![CDATA[Maria]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Pszona]]></surname>
<given-names><![CDATA[Maria]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Wawer]]></surname>
<given-names><![CDATA[Aleksander]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Samsung R&D Institute Poland]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Poland</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>09</month>
<year>2019</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>09</month>
<year>2019</year>
</pub-date>
<volume>23</volume>
<numero>3</numero>
<fpage>1089</fpage>
<lpage>1097</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462019000301089&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462019000301089&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462019000301089&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Fake news recognition has become a prominent research topic in natural language processing. Researchers have reported significant successes when applying machine learning methods based on various stylometric and lexical features, with accuracy reaching 90%. This article asks whether fake news detection models are universally applicable or limited to the domain on which they were trained. We used four different, freely available English-language fake news corpora and trained models in both in-domain and cross-domain settings. We also explored and compared the features important in each domain. We found that performance in the cross-domain setting degrades by 20% and that the sets of features important for detecting fake texts differ between domains. Our conclusions support the hypothesis that the high accuracy of machine learning models applied to fake news detection may be related to over-fitting, and that models need to be trained and evaluated on mixed types of texts.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Fake news detection]]></kwd>
<kwd lng="en"><![CDATA[cross-domain]]></kwd>
<kwd lng="en"><![CDATA[cross-domain failures]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Anderson]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Lix and rix: Variations on a little-known readability index]]></article-title>
<source><![CDATA[Journal of Reading]]></source>
<year>1983</year>
<volume>26</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>490-6</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bond]]></surname>
<given-names><![CDATA[G. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[A. Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Language of lies in prison: Linguistic classification of prisoners' truthful and deceptive natural language]]></article-title>
<source><![CDATA[Applied Cognitive Psychology]]></source>
<year>2005</year>
<volume>19</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>313-29</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chall]]></surname>
<given-names><![CDATA[J. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Dale]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Readability revisited: The new Dale-Chall readability formula]]></source>
<year>1995</year>
<publisher-name><![CDATA[Brookline Books]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Coleman]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Liau]]></surname>
<given-names><![CDATA[T. L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A computer readability formula designed for machine scoring]]></article-title>
<source><![CDATA[Journal of Applied Psychology]]></source>
<year>1975</year>
<volume>60</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>283</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Del Vicario]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Bessi]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Zollo]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Petroni]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Scala]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Caldarelli]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Stanley]]></surname>
<given-names><![CDATA[H. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Quattrociocchi]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The spreading of misinformation online]]></article-title>
<source><![CDATA[Proceedings of the National Academy of Sciences]]></source>
<year>2016</year>
<volume>113</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>554-9</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Flesch]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Flesch-Kincaid readability test]]></article-title>
<source><![CDATA[Retrieved October]]></source>
<year>2007</year>
<volume>26</volume>
<page-range>2007</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gunning]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The fog index after twenty years]]></article-title>
<source><![CDATA[Journal of Business Communication]]></source>
<year>1969</year>
<volume>6</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>3-13</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jin]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Cao]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhou]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Tian]]></surname>
<given-names><![CDATA[Q.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Novel visual and statistical image features for microblogs news verification]]></article-title>
<source><![CDATA[IEEE Transactions on Multimedia]]></source>
<year>2017</year>
<volume>19</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>598-608</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kelly]]></surname>
<given-names><![CDATA[E. F.]]></given-names>
</name>
<name>
<surname><![CDATA[Stone]]></surname>
<given-names><![CDATA[P. J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Computer recognition of English word senses]]></source>
<year>1975</year>
<volume>13</volume>
<publisher-loc><![CDATA[North-Holland ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lasswell]]></surname>
<given-names><![CDATA[H. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Namenwirth]]></surname>
<given-names><![CDATA[J. Z.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Lasswell value dictionary]]></source>
<year>1969</year>
<publisher-name><![CDATA[New Haven]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[McLaughlin]]></surname>
<given-names><![CDATA[G. H.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[SMOG grading - a new readability formula]]></article-title>
<source><![CDATA[Journal of Reading]]></source>
<year>1969</year>
<volume>12</volume>
<numero>8</numero>
<issue>8</issue>
<page-range>639-46</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Newman]]></surname>
<given-names><![CDATA[M. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Pennebaker]]></surname>
<given-names><![CDATA[J. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Berry]]></surname>
<given-names><![CDATA[D. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Richards]]></surname>
<given-names><![CDATA[J. M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Lying words: Predicting deception from linguistic styles]]></article-title>
<source><![CDATA[Personality and Social Psychology Bulletin]]></source>
<year>2003</year>
<volume>29</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>665-75</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ott]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Choi]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Cardie]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Hancock]]></surname>
<given-names><![CDATA[J. T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Finding deceptive opinion spam by any stretch of the imagination]]></article-title>
<source><![CDATA[Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1]]></source>
<year>2011</year>
<page-range>309-19</page-range><publisher-name><![CDATA[Association for Computational Linguistics]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pennebaker]]></surname>
<given-names><![CDATA[J. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Francis]]></surname>
<given-names><![CDATA[M. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Booth]]></surname>
<given-names><![CDATA[R. J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Linguistic inquiry and word count: LIWC 2001]]></article-title>
<source><![CDATA[Mahwah: Lawrence Erlbaum Associates]]></source>
<year>2001</year>
<volume>71</volume>
<numero>2001</numero>
<issue>2001</issue>
<page-range>2001</page-range></nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pérez-Rosas]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Kleinberg]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Lefevre]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Mihalcea]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic detection of fake news]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pérez-Rosas]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Kleinberg]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Lefevre]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Mihalcea]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Automatic detection of fake news]]></article-title>
<source><![CDATA[Proceedings of the 27th International Conference on Computational Linguistics]]></source>
<year>2018</year>
<page-range>3391-401</page-range><publisher-name><![CDATA[Association for Computational Linguistics]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Semin]]></surname>
<given-names><![CDATA[G. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Fiedler]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The cognitive functions of linguistic categories in describing persons: Social cognition and language]]></article-title>
<source><![CDATA[Journal of Personality and Social Psychology]]></source>
<year>1988</year>
<volume>54</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>558</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Senter]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Smith]]></surname>
<given-names><![CDATA[E. A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Automated readability index]]></source>
<year>1967</year>
<publisher-loc><![CDATA[OH ]]></publisher-loc>
<publisher-name><![CDATA[Cincinnati university]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stone]]></surname>
<given-names><![CDATA[P. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Dunphy]]></surname>
<given-names><![CDATA[D. C.]]></given-names>
</name>
<name>
<surname><![CDATA[Smith]]></surname>
<given-names><![CDATA[M. S.]]></given-names>
</name>
</person-group>
<source><![CDATA[The general inquirer: A computer approach to content analysis]]></source>
<year>1966</year>
<publisher-name><![CDATA[MIT Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[W. Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA["Liar, Liar Pants on Fire": A new benchmark dataset for fake news detection]]></article-title>
<source><![CDATA[Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)]]></source>
<year>2017</year>
<page-range>422-6</page-range><publisher-name><![CDATA[Association for Computational Linguistics]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Cui]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Fu]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Gouza]]></surname>
<given-names><![CDATA[F. B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Fake news detection with deep diffusive network model]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhao]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhao]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Sano]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Levy]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Takayasu]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Takayasu]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Havlin]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Fake news propagate differently from real news even at early stages of spreading]]></source>
<year>2018</year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
