<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462025000100481</article-id>
<article-id pub-id-type="doi">10.13053/cys-29-1-4391</article-id>
<title-group>
<article-title xml:lang="es"><![CDATA[Clasificación temática automática exhaustiva del corpus Reuters 21578 con aprendizaje automático supervisado]]></article-title>
<article-title xml:lang="en"><![CDATA[Automatic Thematic Exhaustive Classification of the Reuters 21578 Corpus Using Supervised Machine Learning]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Arengas Acosta]]></surname>
<given-names><![CDATA[Juan Manuel]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Guzmán Cabrera]]></surname>
<given-names><![CDATA[Rafael]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[López Ramírez]]></surname>
<given-names><![CDATA[Misael]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Florez Fuentes]]></surname>
<given-names><![CDATA[Anderson Smith]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Universidad de Guanajuato, Departamento de Estudios Multidisciplinarios]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Mexico</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>03</month>
<year>2025</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>03</month>
<year>2025</year>
</pub-date>
<volume>29</volume>
<numero>1</numero>
<fpage>481</fpage>
<lpage>499</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462025000100481&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462025000100481&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462025000100481&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="es"><p><![CDATA[Resumen: La clasificación automática de textos se ha consolidado como una disciplina de investigación que fusiona técnicas avanzadas de procesamiento de lenguaje natural (PLN) con algoritmos de aprendizaje automático, permitiendo categorizar eficientemente grandes volúmenes de documentos textuales. Se propone un enfoque innovador que integra técnicas actuales de preprocesamiento con algoritmos clásicos de aprendizaje supervisado para mejorar la precisión en la clasificación del corpus Reuters-21578. Se plantea una revisión de la literatura, la implementación de técnicas de preprocesamiento (tokenización, lematización, eliminación de stopwords, conversión a minúsculas y eliminación de caracteres especiales), así como la exploración de algoritmos de aprendizaje supervisado (Regresión Logística, Máquinas de Soporte Vectorial, Naïve Bayes, Random Forest y k-vecinos más cercanos). Se realizaron experimentos con diversas configuraciones, combinando técnicas de preprocesamiento, métodos de selección de características como TF-IDF y los algoritmos ya mencionados. Los hallazgos en los escenarios experimentados revelan que la integración de estas técnicas y algoritmos mejora significativamente la precisión de la clasificación de textos, dando como resultado una configuración apta para el corpus Reuters-21578 que presenta una precisión de hasta el 98.6%. Se propone una metodología empírica rigurosa y eficaz, aplicable a diversos corpus de documentos en formato de texto.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: Automatic text classification has established itself as a research discipline that merges advanced natural language processing (NLP) techniques with machine learning algorithms, making it possible to categorize large volumes of textual documents efficiently. An innovative approach is proposed that integrates current preprocessing techniques with classical supervised learning algorithms to improve classification accuracy on the Reuters-21578 corpus. The work comprises a literature review, the implementation of preprocessing techniques (tokenization, lemmatization, stopword removal, lowercase conversion, and special character removal), and the exploration of supervised learning algorithms (Logistic Regression, Support Vector Machines, Naïve Bayes, Random Forest, and k-nearest neighbors). Experiments were conducted with various configurations, combining preprocessing techniques, feature selection methods such as TF-IDF, and the aforementioned algorithms. The findings in the tested scenarios show that integrating these techniques and algorithms significantly improves text classification accuracy, yielding a configuration suited to the Reuters-21578 corpus that reaches an accuracy of up to 98.6%. A rigorous and efficient empirical methodology is proposed that can be applied to various corpora of text documents.]]></p></abstract>
<kwd-group>
<kwd lng="es"><![CDATA[Algoritmos de clasificación]]></kwd>
<kwd lng="es"><![CDATA[procesamiento del lenguaje natural (PLN)]]></kwd>
<kwd lng="es"><![CDATA[corpus Reuters-21578]]></kwd>
<kwd lng="es"><![CDATA[clasificación temática exhaustiva]]></kwd>
<kwd lng="en"><![CDATA[Classification algorithms]]></kwd>
<kwd lng="en"><![CDATA[natural language processing (NLP)]]></kwd>
<kwd lng="en"><![CDATA[Reuters-21578 corpus]]></kwd>
<kwd lng="en"><![CDATA[exhaustive thematic classification]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Aggarwal]]></surname>
<given-names><![CDATA[C.C.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhai]]></surname>
<given-names><![CDATA[C. X.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A survey of text classification algorithms]]></article-title>
<source><![CDATA[Mining Text Data]]></source>
<year>2012</year>
<page-range>163-222</page-range><publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Manning]]></surname>
<given-names><![CDATA[C.D.]]></given-names>
</name>
<name>
<surname><![CDATA[Raghavan]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Schütze]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[An Introduction to Information Retrieval]]></source>
<year>2009</year>
<publisher-loc><![CDATA[Cambridge ]]></publisher-loc>
<publisher-name><![CDATA[Cambridge University Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jurafsky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Martin]]></surname>
<given-names><![CDATA[J.H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition]]></source>
<year>2023</year>
<volume>1</volume>
<publisher-name><![CDATA[Pearson]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kowsari]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Meimandi]]></surname>
<given-names><![CDATA[K.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Heidarysafa]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Mendu]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Barnes]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Brown]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text classification algorithms: A survey]]></article-title>
<source><![CDATA[Information]]></source>
<year>2019</year>
<volume>10</volume>
<numero>4</numero>
<issue>4</issue>
<publisher-loc><![CDATA[Switzerland ]]></publisher-loc>
<publisher-name><![CDATA[MDPI]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[R.C.]]></given-names>
</name>
<name>
<surname><![CDATA[Dewi]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[S.W.]]></given-names>
</name>
<name>
<surname><![CDATA[Caraka]]></surname>
<given-names><![CDATA[R.E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Selecting critical features for data classification based on machine learning methods]]></article-title>
<source><![CDATA[Journal of Big Data]]></source>
<year>2020</year>
<volume>7</volume>
<numero>1</numero>
<issue>1</issue>
</nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pais]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Cordeiro]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Jamil]]></surname>
<given-names><![CDATA[M.L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[NLP-based platform as a service: A brief review]]></article-title>
<source><![CDATA[Journal of Big Data]]></source>
<year>2022</year>
<volume>9</volume>
<numero>1</numero>
<issue>1</issue>
</nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dhar]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Mukherjee]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Dash]]></surname>
<given-names><![CDATA[N.S.]]></given-names>
</name>
<name>
<surname><![CDATA[Roy]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text categorization: past and present]]></article-title>
<source><![CDATA[Artificial Intelligence Review]]></source>
<year>2021</year>
<volume>54</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>3007-54</page-range></nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kadhim]]></surname>
<given-names><![CDATA[A.I.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Survey on supervised machine learning techniques for automatic text classification]]></article-title>
<source><![CDATA[Artificial Intelligence Review]]></source>
<year>2019</year>
<volume>52</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>273-92</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Alloghani]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Al-Jumeily]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Mustafina]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Hussain]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Aljaaf]]></surname>
<given-names><![CDATA[A.J.]]></given-names>
</name>
</person-group>
<source><![CDATA[A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science]]></source>
<year>2020</year>
<page-range>3-21</page-range>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Shah]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Patel]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Sanghvi]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Shah]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification]]></article-title>
<source><![CDATA[Augmented Human Research]]></source>
<year>2020</year>
<volume>5</volume>
<numero>1</numero>
<issue>1</issue>
</nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhou]]></surname>
<given-names><![CDATA[L.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[X. Da]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[J.N.]]></given-names>
</name>
<name>
<surname><![CDATA[Huo]]></surname>
<given-names><![CDATA[W. J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The Lao text classification method based on KNN]]></article-title>
<source><![CDATA[Procedia Computer Science]]></source>
<year>2020</year>
<volume>166</volume>
<page-range>523-8</page-range></nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bhavani]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Santhosh Kumar]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[A Review of State Art of Text Classification Algorithms]]></source>
<year>2021</year>
<conf-name><![CDATA[5th International Conference on Computing Methodologies and Communication, ICCMC]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1484-90</page-range></nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Shah]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Patel]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Sanghvi]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Shah]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A Comparative Analysis of Logistic Regression, Random Forest and KNN Models for the Text Classification]]></article-title>
<source><![CDATA[Augmented Human Research]]></source>
<year>2020</year>
<volume>5</volume>
<numero>1</numero>
<issue>1</issue>
</nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Singh]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Gupta]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Text stemming: Approaches, applications, and challenges]]></article-title>
<source><![CDATA[ACM Computing Surveys]]></source>
<year>2016</year>
<volume>49</volume>
<numero>3</numero>
<issue>3</issue>
</nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhao]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[J.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Perkins]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Ge]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Ding]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Zou]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A heuristic approach to determine an appropriate number of topics in topic modeling]]></article-title>
<source><![CDATA[BMC Bioinformatics]]></source>
<year>2015</year>
<volume>16</volume>
<numero>13</numero>
<issue>13</issue>
</nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic Classification of Chinese Long Texts Based on Deep Transfer Learning Algorithm]]></source>
<year>2021</year>
<conf-name><![CDATA[2nd International Conference on Artificial Intelligence and Computer Engineering (ICAICE)]]></conf-name>
<conf-date>2021</conf-date>
<conf-loc> </conf-loc>
<page-range>17-20</page-range></nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Duong]]></surname>
<given-names><![CDATA[H.-T.]]></given-names>
</name>
<name>
<surname><![CDATA[Nguyen-Thi]]></surname>
<given-names><![CDATA[T.-A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A review: Preprocessing techniques and data augmentation for sentiment analysis]]></article-title>
<source><![CDATA[Computational Social Networks]]></source>
<year>2021</year>
<volume>8</volume>
<numero>1</numero>
<issue>1</issue>
</nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Liao]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Automatic Keyword Extraction from Documents Using Conditional Random Fields]]></article-title>
<source><![CDATA[Journal of Computational Information Systems]]></source>
<year>2008</year>
<volume>4</volume>
</nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sun]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Luo]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A review of natural language processing techniques for opinion mining systems]]></article-title>
<source><![CDATA[Information Fusion]]></source>
<year>2017</year>
<volume>36</volume>
<page-range>10-25</page-range></nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Zheng]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Xue]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Hu]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Hierarchical multi-attention networks for document classification]]></article-title>
<source><![CDATA[International Journal of Machine Learning and Cybernetics]]></source>
<year>2021</year>
<volume>12</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>1639-47</page-range></nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sebastiani]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Machine Learning in Automated Text Categorization]]></article-title>
<source><![CDATA[ACM Computing Surveys]]></source>
<year>2002</year>
<volume>34</volume>
<page-range>1-47</page-range></nlm-citation>
</ref>
<ref id="B22">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cárdenas]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Olivares]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Alfaro]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Clasificación automática de textos usando redes de palabras]]></article-title>
<source><![CDATA[Revista Signos]]></source>
<year>2014</year>
<volume>47</volume>
<numero>86</numero>
<issue>86</issue>
<page-range>346-64</page-range></nlm-citation>
</ref>
<ref id="B23">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Guardiola González]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Clasificador de textos mediante técnicas de aprendizaje automático]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B24">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Patel]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Pathak]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Khan]]></surname>
<given-names><![CDATA[M.I.]]></given-names>
</name>
</person-group>
<source><![CDATA[Automated Text Categorization]]></source>
<year>2021</year>
<conf-name><![CDATA[3rd International Conference on Signal Processing and Communication (ICPSC)]]></conf-name>
<conf-date>2021</conf-date>
<conf-loc> </conf-loc>
<page-range>16-20</page-range></nlm-citation>
</ref>
<ref id="B25">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mitchell]]></surname>
<given-names><![CDATA[T.M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Machine Learning]]></source>
<year>1997</year>
<volume>1</volume>
<publisher-name><![CDATA[McGraw-Hill Science/Engineering/Math]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B26">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Quirós Díaz]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Vidal]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Layout Analysis for Handwritten Documents a Probabilistic Machine Learning Approach]]></source>
<year>2021</year>
<publisher-name><![CDATA[Universitat Politècnica de València]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B27">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bishop]]></surname>
<given-names><![CDATA[C.M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Pattern Recognition and Machine Learning]]></source>
<year>2006</year>
<volume>1</volume>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B28">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Weiss]]></surname>
<given-names><![CDATA[S.M.]]></given-names>
</name>
<name>
<surname><![CDATA[Indurkhya]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Damerau]]></surname>
<given-names><![CDATA[F. J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Text Mining: Predictive Methods for Analyzing Unstructured Information]]></source>
<year>2005</year>
<volume>1</volume>
<publisher-name><![CDATA[Springer Science+Business Media]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B29">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mohan]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Ilamathi]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Preprocessing Techniques for Text Mining-An Overview]]></article-title>
<source><![CDATA[International Journal of Computer Science &amp; Communication Networks]]></source>
<year>2020</year>
<volume>5</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>7-16</page-range></nlm-citation>
</ref>
<ref id="B30">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[González Escudero]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Parapar López]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Plataforma para Procesado de Lenguaje Natural como Servicio]]></source>
<year>2022</year>
<publisher-name><![CDATA[Universidad de Coruña]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B31">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Giménez Fayos]]></surname>
<given-names><![CDATA[M.T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Natural Language Processing using Deep Learning in Social Media]]></source>
<year>2021</year>
<publisher-name><![CDATA[Universitat Politècnica de València]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B32">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ruiz Rico]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<source><![CDATA[Selección y ponderación de características para la clasificación de textos y su aplicación en el diagnóstico médico]]></source>
<year>2013</year>
<publisher-name><![CDATA[Universidad de Alicante]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B33">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dalaorao]]></surname>
<given-names><![CDATA[G.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Sison]]></surname>
<given-names><![CDATA[A.M.]]></given-names>
</name>
<name>
<surname><![CDATA[Aguinaldo]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Medina]]></surname>
<given-names><![CDATA[R. P.]]></given-names>
</name>
</person-group>
<source><![CDATA[Integrating Collocation as TF-IDF Enhancement to Improve Classification Accuracy]]></source>
<year>2019</year>
<volume>1</volume>
<numero>1</numero>
<conf-name><![CDATA[13th International Conference on Telecommunication Systems, Services, and Applications (TSSA)]]></conf-name>
<conf-loc> </conf-loc>
<issue>1</issue>
<page-range>282</page-range></nlm-citation>
</ref>
<ref id="B34">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rahmah]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Santoso]]></surname>
<given-names><![CDATA[H.B.]]></given-names>
</name>
<name>
<surname><![CDATA[Hasibuan]]></surname>
<given-names><![CDATA[Z.A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Exploring Technology-Enhanced Learning Key Terms using TF-IDF Weighting]]></source>
<year>2019</year>
<page-range>1-4</page-range><publisher-name><![CDATA[IEEE]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B35">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sidorov]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<source><![CDATA[Syntactic n-grams in computational linguistics]]></source>
<year>2019</year>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B36">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lojo Vicente]]></surname>
<given-names><![CDATA[J.D.]]></given-names>
</name>
<name>
<surname><![CDATA[Barreiro García]]></surname>
<given-names><![CDATA[Á.]]></given-names>
</name>
<name>
<surname><![CDATA[Losada Carril]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Clasificación Automática de Documentación Clínica]]></source>
<year>2012</year>
<publisher-name><![CDATA[Universidad de Coruña]]></publisher-name>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
