<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1870-9044</journal-id>
<journal-title><![CDATA[Polibits]]></journal-title>
<abbrev-journal-title><![CDATA[Polibits]]></abbrev-journal-title>
<issn>1870-9044</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1870-90442011000200013</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Inference of Fine-grained Attributes of Bengali Corpus for Stylometry Detection]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Chakraborty]]></surname>
<given-names><![CDATA[Tanmoy]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[Sivaji]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Jadavpur University, Department of Computer Science and Engineering]]></institution>
<addr-line><![CDATA[Kolkata ]]></addr-line>
<country>India</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>12</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>12</month>
<year>2011</year>
</pub-date>
<numero>44</numero>
<fpage>79</fpage>
<lpage>83</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1870-90442011000200013&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1870-90442011000200013&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1870-90442011000200013&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Stylometry, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and belongs to the core task of text categorization that involves authorship identification, plagiarism detection, forensic investigation, computer security, copyright and estate disputes, etc. In this work, we present a strategy for stylometry detection of documents written in Bengali. We adopt a set of fine-grained attribute features with a set of lexical markers for the analysis of the text and use three semi-supervised measures for making decisions. Finally, a majority voting approach is used for the final classification. The system is fully automatic and language-independent. Evaluation results for stylometry detection of Bengali authors show reasonably promising accuracy in comparison to the baseline model.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Stylometry]]></kwd>
<kwd lng="en"><![CDATA[stylistic markers]]></kwd>
<kwd lng="en"><![CDATA[cosine-similarity]]></kwd>
<kwd lng="en"><![CDATA[chi-square measure]]></kwd>
<kwd lng="en"><![CDATA[Euclidean distance]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <p align="center"><font face="verdana" size="4"><b>Inference of Fine-grained Attributes of Bengali Corpus for Stylometry Detection</b></font></p> <p align="center"><font face="verdana" size="2">&nbsp;</font></p> <p align="center"><font face="verdana" size="2"><b>Tanmoy Chakraborty* and Sivaji Bandyopadhyay**</b></font></p> <p align="justify"><font face="verdana" size="2">&nbsp;</font></p> <p align="justify"><font face="verdana" size="2"><i>Department of Computer Science and Engineering, Jadavpur University, Kolkata, India</i> (e-mail: *<a href="mailto:its_tanmoy@yahoo.co.in">its_tanmoy@yahoo.co.in</a>, **<a href="mailto:sivaji_cse_ju@yahoo.com">sivaji_cse_ju@yahoo.com</a>).</font></p> <p align="justify"><font face="verdana" size="2">&nbsp;</font></p> <p align="justify"><font face="verdana" size="2">Manuscript received November 7, 2010. <br> Manuscript accepted for publication February 6, 2011.</font></p> <p align="justify"><font face="verdana" size="2">&nbsp;</font></p> ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2"><b>Abstract</b></font></p>
<p align="justify"><font face="verdana" size="2">Stylometry, the science of inferring characteristics of the author from the characteristics of documents written by that author, is a problem with a long history and belongs to the core task of text categorization that involves authorship identification, plagiarism detection, forensic investigation, computer security, copyright and estate disputes, etc. In this work, we present a strategy for stylometry detection of documents written in Bengali. We adopt a set of fine-grained attribute features with a set of lexical markers for the analysis of the text and use three semi-supervised measures for making decisions. Finally, a majority voting approach is used for the final classification. The system is fully automatic and language-independent. Evaluation results for stylometry detection of Bengali authors show reasonably promising accuracy in comparison to the baseline model.</font></p>
<p align="justify"><font face="verdana" size="2"><b>Key words:</b> Stylometry, stylistic markers, cosine-similarity, chi-square measure, Euclidean distance.</font></p>
<p align="justify"><font face="verdana" size="2">&nbsp;</font></p>
<p align="justify"><font face="verdana" size="2"><a href="/pdf/poli/n44/n44a13.pdf" target="_blank">DOWNLOAD ARTICLE IN PDF FORMAT</a></font></p>
<p align="justify"><font face="verdana" size="2">&nbsp;</font></p>
<p align="justify"><font face="verdana" size="2"><b>REFERENCES</b></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;1&#93; D. Holmes, "Authorship Attribution," <i>Computers and the Humanities,</i> 28, 87&ndash;106, 1994.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;2&#93; D.J. Croft, "Book of Mormon word prints reexamined," <i>Sunstone Publish.,</i> 6, 15&ndash;22, 1981.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;3&#93; D. Pavelec, E. Justino, and L.S. Oliveira, "Author Identification using Stylometric features," <i>Inteligencia Artificial, Revista Iberoamericana de Inteligencia Artificial,</i> 11, 59&ndash;65, 2007.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;4&#93; E. Stamatatos, N. Fakotakis, and G. Kokkinakis, "Automatic authorship attribution," in <i>Proc. of the 9th Conference on European Chapter of the ACL,</i> 1999, pp. 158&ndash;165.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;5&#93; K. H. Krippendorff, <i>Content Analysis: An Introduction to its Methodology,</i> Sage Publications Inc., 2nd Edition, 440 p., 2003.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;6&#93; M.B. Malyutov, "Authorship attribution of texts: A review," <i>Lecture Notes in Computer Science,</i> vol. 4123, 362&ndash;380, 2006.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;7&#93; S. Argamon, M. Saric, and S.S. Stein, "Style mining of electronic messages for multiple authorship discrimination: First results," in <i>Proc. 9th ACM SIGKDD,</i> 2003, pp. 475&ndash;480.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;8&#93; T.K. Mustafa, N. Mustapha, M.A. Azmi, and N.B. Sulaiman, "Dropping down the Maximum Item Set: Improving the Stylometric Authorship Attribution Algorithm in the Text Mining for Authorship Investigation," <i>Journal of Computer Science,</i> 6 (3), 235&ndash;243, 2010.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;9&#93; T. Zhang, F. Damerau, and D. Johnson, "Text chunking using regularized winnow," in <i>Proc. 39th Annual Meeting on ACL,</i> 2002, pp. 539&ndash;546.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;10&#93; T. Chakraborty and S. Bandyopadhyay, "Identification of Reduplication in Bengali Corpus and their semantic Analysis: A Rule Based Approach," in <i>Proc. of the COLING (MWE 2010),</i> Beijing, 2010, pp. 72&ndash;75.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;11&#93; V. H. Halteren, "Linguistic profiling for author recognition and verification," in <i>Proceedings of the 2005 Meeting of the Association for Computational Linguistics (ACL),</i> 2005.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;12&#93; S. Argamon, M. Saric, and S. S. Stein, "Style mining of electronic messages for multiple authorship discrimination: First results," in <i>Proceedings of the 2003 Association for Computing Machinery Conference on Knowledge Discovery and Data Mining (ACM SIGKDD),</i> 2003, pp. 475&ndash;480.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;13&#93; D. Madigan, A. Genkin, D. D. Lewis, S. Argamon, D. Fradkin, and L. Ye, "Author identification on the large scale," in <i>Proceedings of the 2005 Meeting of the Classification Society of North America (CSNA),</i> 2005.<!-- end-ref --></font></p>
<!-- ref --><p align="justify"><font face="verdana" size="2">&#91;14&#93; M. Koppel, J. Schler, and E. Bonchek-Dokow, "Measuring differentiability: Unmasking pseudonymous authors," <i>Journal of Machine Learning Research,</i> 8, 1261&ndash;1276, 2007.<!-- end-ref --></font></p>
]]></body><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Holmes]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Authorship Attribution]]></article-title>
<source><![CDATA[Computers and the Humanities]]></source>
<year>1994</year>
<volume>28</volume>
<page-range>87-106</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Croft]]></surname>
<given-names><![CDATA[D.J.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Book of Mormon word prints reexamined]]></article-title>
<source><![CDATA[Sunstone Publish.]]></source>
<year>1981</year>
<volume>6</volume>
<page-range>15-22</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pavelec]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Justino]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Oliveira]]></surname>
<given-names><![CDATA[L.S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Author Identification using Stylometric features]]></article-title>
<source><![CDATA[Inteligencia Artificial, Revista Iberoamericana de Inteligencia Artificial]]></source>
<year>2007</year>
<volume>11</volume>
<page-range>59-65</page-range></nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stamatatos]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Fakotakis]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Kokkinakis]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Automatic authorship attribution]]></article-title>
<source><![CDATA[Proc. of the 9th Conference on European Chapter of the ACL]]></source>
<year>1999</year>
<page-range>158-165</page-range></nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Krippendorff]]></surname>
<given-names><![CDATA[K. H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Content Analysis: An Introduction to its Methodology]]></source>
<year>2003</year>
<edition>2nd</edition>
<page-range>440</page-range><publisher-name><![CDATA[Sage Publications Inc.]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Malyutov]]></surname>
<given-names><![CDATA[M.B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Authorship attribution of texts: A review]]></article-title>
<source><![CDATA[Lecture Notes in Computer Science]]></source>
<year>2006</year>
<volume>4123</volume>
<page-range>362-380</page-range></nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Argamon]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Saric]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Stein]]></surname>
<given-names><![CDATA[S.S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Style mining of electronic messages for multiple authorship discrimination: First results]]></article-title>
<source><![CDATA[Proc. 9th ACM SIGKDD]]></source>
<year>2003</year>
<page-range>475-480</page-range></nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mustafa]]></surname>
<given-names><![CDATA[T.K.]]></given-names>
</name>
<name>
<surname><![CDATA[Mustapha]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Azmi]]></surname>
<given-names><![CDATA[M.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Sulaiman]]></surname>
<given-names><![CDATA[N.B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Dropping down the Maximum Item Set: Improving the Stylometric Authorship Attribution Algorithm in the Text Mining for Authorship Investigation]]></article-title>
<source><![CDATA[Journal of Computer Science]]></source>
<year>2010</year>
<volume>6</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>235-243</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Damerau]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Johnson]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Text chunking using regularized winnow]]></article-title>
<source><![CDATA[Proc. 39th Annual Meeting on ACL]]></source>
<year>2002</year>
<page-range>539-546</page-range></nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chakraborty]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Identification of Reduplication in Bengali Corpus and their semantic Analysis: A Rule Based Approach]]></article-title>
<source><![CDATA[Proc. of the COLING (MWE 2010)]]></source>
<year>2010</year>
<page-range>72-75</page-range><publisher-loc><![CDATA[Beijing ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Halteren]]></surname>
<given-names><![CDATA[V. H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Linguistic profiling for author recognition and verification]]></article-title>
<source><![CDATA[Proceedings of the 2005 Meeting of the Association for Computational Linguistics (ACL)]]></source>
<year>2005</year>
</nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Argamon]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Saric]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Stein]]></surname>
<given-names><![CDATA[S. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Style mining of electronic messages for multiple authorship discrimination: First results]]></article-title>
<source><![CDATA[Proceedings of the 2003 Association for Computing Machinery Conference on Knowledge Discovery and Data Mining (ACM SIGKDD)]]></source>
<year>2003</year>
<page-range>475-480</page-range></nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Madigan]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Genkin]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Lewis]]></surname>
<given-names><![CDATA[D. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Argamon]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Fradkin]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Ye]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Author identification on the large scale]]></article-title>
<source><![CDATA[Proceedings of the 2005 Meeting of the Classification Society of North America (CSNA)]]></source>
<year>2005</year>
</nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Koppel]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Schler]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Bonchek-Dokow]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Measuring differentiability: Unmasking pseudonymous authors]]></article-title>
<source><![CDATA[Journal of Machine Learning Research]]></source>
<year>2007</year>
<volume>8</volume>
<page-range>1261-1276</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
