<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1870-9044</journal-id>
<journal-title><![CDATA[Polibits]]></journal-title>
<abbrev-journal-title><![CDATA[Polibits]]></abbrev-journal-title>
<issn>1870-9044</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1870-90442008000100004</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Web-based Bengali News Corpus for Lexicon Development and POS Tagging]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Ekbal]]></surname>
<given-names><![CDATA[Asif]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[Sivaji]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Jadavpur University, Department of Computer Science and Engineering]]></institution>
<addr-line><![CDATA[Kolkata ]]></addr-line>
<country>India</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>06</month>
<year>2008</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>06</month>
<year>2008</year>
</pub-date>
<numero>37</numero>
<fpage>21</fpage>
<lpage>30</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1870-90442008000100004&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1870-90442008000100004&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1870-90442008000100004&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) applications. The rapid development of these resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. We have used a Bengali news corpus, developed from the web archive of a widely read Bengali newspaper. The corpus contains approximately 34 million wordforms. This corpus is used for lexicon development without employing extensive knowledge of the language. We have developed POS taggers using a Hidden Markov Model (HMM) and a Support Vector Machine (SVM). The lexicon contains around 128 thousand entries, and a manual check yields an accuracy of 79.6%. Initially, the POS taggers were developed for Bengali and showed accuracies of 85.56% and 91.23% for HMM and SVM, respectively. Based on the Bengali news corpus, we identify various word-level orthographic features to use in the POS taggers. The lexicon and a Named Entity Recognition (NER) system, developed using this corpus, are also used in POS tagging. The POS taggers are then evaluated with Hindi and Telugu data. Evaluation results demonstrate that SVM performs better than HMM for all three Indian languages.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Web-based corpus]]></kwd>
<kwd lng="en"><![CDATA[lexicon]]></kwd>
<kwd lng="en"><![CDATA[part of speech (POS) tagging]]></kwd>
<kwd lng="en"><![CDATA[hidden Markov model (HMM)]]></kwd>
<kwd lng="en"><![CDATA[support vector machine (SVM)]]></kwd>
<kwd lng="en"><![CDATA[Bengali]]></kwd>
<kwd lng="en"><![CDATA[Hindi]]></kwd>
<kwd lng="en"><![CDATA[Telugu]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[  	    <p align="justify"><font face="verdana" size="4">Special section: natural language processing</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="center"><font face="verdana" size="4"><b>Web&#150;based Bengali News Corpus for Lexicon Development and POS Tagging</b></font></p>  	    <p align="center"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="center"><font face="verdana" size="2"><b>Asif Ekbal and Sivaji Bandyopadhyay</b></font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><i>Department of Computer Science and Engineering, Jadavpur University, Kolkata, India 700032, e&#150;mail:</i> <a href="mailto:asif.ekbal@gmail.com">asif.ekbal@gmail.com</a>, <a href="mailto:sivaji_cse_ju@yahoo.com">sivaji_cse_ju@yahoo.com</a>.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2">Manuscript received May 4, 2008.    ]]></body>
<body><![CDATA[<br> 	Manuscript accepted for publication June 12, 2008.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>Abstract</b></font></p>  	    <p align="justify"><font face="verdana" size="2">Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) applications. The rapid development of these resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. We have used a Bengali news corpus, developed from the web archive of a widely read Bengali newspaper. The corpus contains approximately 34 million wordforms. This corpus is used for lexicon development without employing extensive knowledge of the language. We have developed POS taggers using a Hidden Markov Model (HMM) and a Support Vector Machine (SVM). The lexicon contains around 128 thousand entries, and a manual check yields an accuracy of 79.6%. Initially, the POS taggers were developed for Bengali and showed accuracies of 85.56% and 91.23% for HMM and SVM, respectively. Based on the Bengali news corpus, we identify various word&#150;level orthographic features to use in the POS taggers. The lexicon and a Named Entity Recognition (NER) system, developed using this corpus, are also used in POS tagging. The POS taggers are then evaluated with Hindi and Telugu data. 
Evaluation results demonstrate that SVM performs better than HMM for all three Indian languages.</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>Key words:</b> Web-based corpus, lexicon, part of speech (POS) tagging, hidden Markov model (HMM), support vector machine (SVM), Bengali, Hindi, Telugu.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><a href="/pdf/poli/n37/n37a4.pdf" target="_blank">DOWNLOAD ARTICLE IN PDF FORMAT</a></font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>REFERENCES</b></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;1&#93; M. Rundell, "The Biggest Corpus of All," <i>Humanising Language Teaching,</i> vol. 2, no. 3, 2000.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;2&#93; W. H. Fletcher, "Concordancing the Web with KWiCFinder," in <i>Proceedings of the Third North American Symposium on Corpus Linguistics and Language Teaching,</i> 23&#150;25 March 2001.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;3&#93; T. Robb, "Google as a Corpus Tool?," <i>ETJ Journal,</i> vol. 4, no. 1, Spring 2003.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;4&#93; W. H. Fletcher, "Making the Web More Useful as a Source for Linguistic Corpora," in <i>Ulla Connor and Thomas A. Upton (eds.), Applied Corpus Linguistics: A Multidimensional Perspective,</i> pp. 191&#150;205, 2004.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;5&#93; A. Kilgarriff and G. Grefenstette, "Introduction to the Special Issue on the Web as Corpus," <i>Computational Linguistics,</i> vol. 29, no. 3, pp. 333&#150;347, 2003.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;6&#93; A. Lenci, N. Bel, F. Busa, N. Calzolari, E. Gola, M. Monachini, A. Ogonowsky, I. Peters, W. Peters, N. Ruimy, M. Villegas, and A. Zampolli, "Simple: A General Framework for the Development of Multilingual Lexicons," <i>International Journal of Lexicography, Special Issue, Dictionaries, Thesauri and Lexical&#150;Semantic Relations,</i> vol. XIII, no. 4, pp. 249&#150;263, 2000.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;7&#93; N. Calzolari, F. Bertagna, A. Lenci, and M. Monachini, "Standards and Best Practice for Multilingual Computational Lexicons, MILE (the Multilingual ISLE Lexical Entry)," <i>ISLE Deliverable D2.2 &amp; 3.2,</i> 2003.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;8&#93; F. Bertagna, A. Lenci, M. Monachini, and N. Calzolari, "Content interoperability of lexical resources, open issues and 'mile' perspectives," in <i>Proceedings of the LREC 2004,</i> pp. 131&#150;134, 2004.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;9&#93; T. Takenobou, V. Sornlertlamvanich, T. Charoenporn, N. Calzolari, M. Monachini, C. Soria, C. Huang, X. YingJu, Y. Hao, L. Prevot, and S. Kiyoaki, "Infrastructure for Standardization of Asian Languages Resources," in <i>Proceedings of the COLING/ACL 2006,</i> pp. 827&#150;834, 2006.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;10&#93; D. Cutting, J. Kupiec, J. Pedersen, and P. Sibun, "A Practical Part&#150;of&#150;Speech Tagger," in <i>Proceedings of the Third Conference on Applied Natural Language Processing,</i> pp. 133&#150;140, 1992.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;11&#93; B. Merialdo, "Tagging English Text with a Probabilistic Model," <i>Computational Linguistics,</i> vol. 20, no. 2, pp. 155&#150;171, 1994.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;12&#93; T. Brants, "TnT: A Statistical Part&#150;of&#150;Speech Tagger," in <i>Proceedings of the sixth International Conference on Applied Natural Language Processing ANLP&#150;2000,</i> pp. 224&#150;231, 2000.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;13&#93; A. Ratnaparkhi, "A maximum entropy part&#150;of&#150;speech tagger," in <i>Proc. of EMNLP'96,</i> 1996.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;14&#93; J. Lafferty, A. McCallum, and F. Pereira, "Conditional random fields: Probabilistic models for segmenting and labeling sequence data," in <i>Proceedings of the 18th International Conference on Machine Learning,</i> 2001.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;15&#93; T. Kudo and Y. Matsumoto, "Chunking with Support Vector Machines," in <i>Proceedings of NAACL,</i> pp. 192&#150;199, 2001.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;16&#93; S. Singh, K. Gupta, M. Shrivastava, and P. Bhattacharyya, "Morphological richness offsets resource demand&#150;experiences in constructing a POS tagger for Hindi," in <i>Proceedings of the COLING/ACL 2006,</i> pp. 779&#150;786, 2006.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;17&#93; P. Avinesh and G. Karthik, "Part Of Speech Tagging and Chunking using Conditional Random Fields and Transformation Based Learning," in <i>Proceedings of IJCAI Workshop on Shallow Parsing for South Asian Languages,</i> pp. 21&#150;24, 2007.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;18&#93; S. Dandapat, "Part Of Speech Tagging and Chunking with Maximum Entropy Model," in <i>Proceedings of the IJCAI Workshop on Shallow Parsing for South Asian Languages,</i> (Hyderabad, India), pp. 29&#150;32, 2007.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;19&#93; A. Ekbal, R. Haque, and S. Bandyopadhyay, "Maximum Entropy based Bengali Part of Speech Tagging," in <i>A. Gelbukh (Ed.), Advances in Natural Language Processing and Applications, Research in Computing Science (RCS) Journal,</i> vol. 33, pp. 67&#150;78.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;20&#93; A. Ekbal, R. Haque, and S. Bandyopadhyay, "Bengali Part of Speech Tagging using Conditional Random Field," in <i>Proceedings of the seventh International Symposium on Natural Language Processing, SNLP&#150;2007,</i> 2007.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;21&#93; A. Ekbal, R. Haque, and S. Bandyopadhyay, "Named Entity Recognition in Bengali: A Conditional Random Field Approach," in <i>Proceedings of 3rd International Joint Conference on Natural Language Processing (IJCNLP&#150;08),</i> pp. 589&#150;594, 2008.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;22&#93; A. Ekbal and S. Bandyopadhyay, "A Web&#150;based Bengali News Corpus for Named Entity Recognition," <i>Language Resources and Evaluation Journal,</i> vol. 40, doi: 10.1007/s10579-008-9064-x, 2008.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;23&#93; D. Jurafsky and J. H. Martin, <i>Speech and Language Processing.</i> Prentice&#150;Hall, 2000.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;24&#93; A. J. Viterbi, "Error bounds for convolutional codes and an asymptotically optimum decoding algorithm," <i>IEEE Transactions on Information Theory,</i> vol. 13, no. 2, pp. 260&#150;267, 1967.
<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;25&#93; V. N. Vapnik, <i>The Nature of Statistical Learning Theory.</i> New York, NY, USA: Springer&#150;Verlag New York, Inc., 1995.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;26&#93; C. Cortes and V. N. Vapnik, "Support Vector Networks," <i>Machine Learning,</i> vol. 20, pp. 273&#150;297, 1995.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;27&#93; T. Joachims, <i>Making Large Scale SVM Learning Practical,</i> pp. 169&#150;184. Cambridge, MA, USA: MIT Press, 1999.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2">&#91;28&#93; H. Taira and M. 
Haruno, "Feature Selection in SVM Text Categorization," in <i>Proceedings of AAAI&#150;99,</i> 1999.<!-- end-ref --></font></p>      ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rundell]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The Biggest Corpus of All]]></article-title>
<source><![CDATA[Humanising Language Teaching]]></source>
<year>2000</year>
<volume>2</volume>
<numero>3</numero>
<issue>3</issue>
</nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fletcher]]></surname>
<given-names><![CDATA[W. H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Concordancing the Web with KWiCFinder]]></article-title>
<source><![CDATA[Proceedings of the Third North American Symposium on Corpus Linguistics and Language Teaching]]></source>
<year>2001</year>
<month>03</month>
<day>23-25</day>
</nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Robb]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Google as a Corpus Tool?]]></article-title>
<source><![CDATA[ETJ Journal]]></source>
<year>2003</year>
<volume>4</volume>
<numero>1</numero>
<issue>1</issue>
<season><![CDATA[Spring]]></season>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fletcher]]></surname>
<given-names><![CDATA[W. H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Making the Web More Useful as a Source for Linguistic Corpora]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Connor]]></surname>
<given-names><![CDATA[Ulla]]></given-names>
</name>
<name>
<surname><![CDATA[Upton]]></surname>
<given-names><![CDATA[Thomas A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Applied Corpus Linguistics: A Multidimensional Perspective]]></source>
<year>2004</year>
<page-range>191-205</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kilgarriff]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Grefenstette]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Introduction to the Special Issue on the Web as Corpus]]></article-title>
<source><![CDATA[Computational Linguistics]]></source>
<year>2003</year>
<volume>29</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>333-347</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lenci]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Bel]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Busa]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Calzolari]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Gola]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Monachini]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Ogonowsky]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Peters]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Peters]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Ruimy]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Villegas]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Zampolli]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Simple: A General Framework for the Development of Multilingual Lexicons]]></article-title>
<source><![CDATA[International Journal of Lexicography]]></source>
<year>2000</year>
<volume>XIII</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>249-263</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Calzolari]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Bertagna]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Lenci]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Monachini]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Standards and Best Practice for Multilingual Computational Lexicons, MILE (the Multilingual ISLE Lexical Entry)]]></source>
<year>2003</year>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bertagna]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Lenci]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Monachini]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Calzolari]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Content interoperability of lexical resources, open issues and 'mile' perspectives]]></article-title>
<source><![CDATA[Proceedings of the LREC 2004]]></source>
<year>2004</year>
<page-range>131-134</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Takenobou]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Sornlertlamvanich]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Charoenporn]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Calzolari]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Monachini]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Soria]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[YingJu]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Hao]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Prevot]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Kiyoaki]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Infrastructure for Standardization of Asian Languages Resources]]></article-title>
<source><![CDATA[Proceedings of the COLING/ACL 2006]]></source>
<year>2006</year>
<page-range>827-834</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cutting]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Kupiec]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Pedersen]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Sibun]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A Practical Part-of-Speech Tagger]]></article-title>
<source><![CDATA[Proceedings of the Third Conference on Applied Natural Language Processing]]></source>
<year>1992</year>
<page-range>133-140</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Merialdo]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Tagging English Text with a Probabilistic Model]]></article-title>
<source><![CDATA[Computational Linguistics]]></source>
<year>1994</year>
<volume>20</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>155-171</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Brants]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[TnT: A Statistical Part-of-Speech Tagger]]></article-title>
<source><![CDATA[Proceedings of the Sixth International Conference on Applied Natural Language Processing ANLP-2000]]></source>
<year>2000</year>
<page-range>224-231</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ratnaparkhi]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A maximum entropy part-of-speech tagger]]></article-title>
<source><![CDATA[Proc. of EMNLP'96]]></source>
<year>1996</year>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lafferty]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[McCallum]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Pereira]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Conditional random fields: Probabilistic models for segmenting and labeling sequence data]]></article-title>
<source><![CDATA[Proceedings of the 18th International Conference on Machine Learning]]></source>
<year>2001</year>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kudo]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Matsumoto]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Chunking with Support Vector Machines]]></article-title>
<source><![CDATA[Proceedings of NAACL]]></source>
<year>2001</year>
<page-range>192-199</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Singh]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Gupta]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Shrivastava]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Bhattacharyya]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Morphological Richness Offsets Resource Demand: Experiences in Constructing a POS Tagger for Hindi]]></article-title>
<source><![CDATA[Proceedings of the COLING/ACL 2006]]></source>
<year>2006</year>
<page-range>779-786</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Avinesh]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Karthik]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Part Of Speech Tagging and Chunking using Conditional Random Fields and Transformation Based Learning]]></article-title>
<source><![CDATA[Proceedings of IJCAI Workshop on Shallow Parsing for South Asian Languages]]></source>
<year>2007</year>
<page-range>21-24</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dandapat]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Part Of Speech Tagging and Chunking with Maximum Entropy Model]]></article-title>
<source><![CDATA[Proceedings of the IJCAI Workshop on Shallow Parsing for South Asian Languages, (Hyderabad, India)]]></source>
<year>2007</year>
<page-range>29-32</page-range></nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ekbal]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Haque]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Maximum Entropy based Bengali Part of Speech Tagging]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Gelbukh]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Advances in Natural Language Processing and Applications, Research in Computing Science (RCS) Journal]]></source>
<year></year>
<volume>33</volume>
<page-range>67-78</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ekbal]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Haque]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Bengali Part of Speech Tagging using Conditional Random Field]]></article-title>
<source><![CDATA[Proceedings of the Seventh International Symposium on Natural Language Processing, SNLP-2007]]></source>
<year>2007</year>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ekbal]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Haque]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Named Entity Recognition in Bengali: A Conditional Random Field Approach]]></article-title>
<source><![CDATA[Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP-08)]]></source>
<year>2008</year>
<page-range>589-594</page-range></nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ekbal]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Bandyopadhyay]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A Web-based Bengali News Corpus for Named Entity Recognition]]></article-title>
<source><![CDATA[Language Resources and Evaluation Journal]]></source>
<year>2008</year>
<volume>40</volume>
</nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jurafsky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Martin]]></surname>
<given-names><![CDATA[J. H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech and Language Processing]]></source>
<year>2000</year>
<publisher-name><![CDATA[Prentice-Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Viterbi]]></surname>
<given-names><![CDATA[A. J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Error bounds for convolutional codes and an asymptotically optimum decoding algorithm]]></source>
<year>1967</year>
<volume>13</volume>
<page-range>260-267</page-range><publisher-name><![CDATA[IEEE Transactions on Information Theory]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B25">
<label>25</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Vapnik]]></surname>
<given-names><![CDATA[V. N.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Nature of Statistical Learning Theory]]></source>
<year>1995</year>
<publisher-loc><![CDATA[New York, NY]]></publisher-loc>
<publisher-name><![CDATA[Springer-Verlag New York, Inc.]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B26">
<label>26</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Vapnik]]></surname>
<given-names><![CDATA[V. N.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Support Vector Networks]]></article-title>
<source><![CDATA[Machine Learning]]></source>
<year>1995</year>
<volume>20</volume>
<page-range>273-297</page-range></nlm-citation>
</ref>
<ref id="B27">
<label>27</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Joachims]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Making Large Scale SVM Learning Practical]]></source>
<year>1999</year>
<page-range>169-184</page-range><publisher-loc><![CDATA[Cambridge, MA]]></publisher-loc>
<publisher-name><![CDATA[MIT Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B28">
<label>28</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Taira]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Haruno]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Feature Selection in SVM Text Categorization]]></source>
<year>1999</year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
