<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462022000100281</article-id>
<article-id pub-id-type="doi">10.13053/cys-26-1-4171</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Application of the LDA Model for Obtaining Topics from the WIKICORPUS]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Martínez Guzmán]]></surname>
<given-names><![CDATA[Gerardo]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Bernábe Loranca]]></surname>
<given-names><![CDATA[María Beatriz]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Cerón Garnica]]></surname>
<given-names><![CDATA[Carmen]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Serrano Pérez]]></surname>
<given-names><![CDATA[Jonathan]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Archundia Sierra]]></surname>
<given-names><![CDATA[Etelvina]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Benemérita Universidad Autónoma de Puebla, Facultad de Ciencias de la Computación]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Mexico</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>03</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>03</month>
<year>2022</year>
</pub-date>
<volume>26</volume>
<numero>1</numero>
<fpage>281</fpage>
<lpage>293</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462022000100281&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462022000100281&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462022000100281&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: A fundamental problem in the analysis of large text collections is discovering the topics described in the documents. One of the most useful applications is the extraction of topics from a document corpus. Such is the case of Wikicorpus, which consists of approximately 250,000 documents totaling 250 million words. In this work, a system based on the Latent Dirichlet Allocation (LDA) model was developed to automatically select words from the corpus and, based on their frequency in the documents, indicate whether or not they belong to a given topic, thereby classifying words without human intervention. Because of the size of the corpus, a Serial-Parallel Algorithm (SPA) was implemented in C/C++ with OpenMP. A shared-memory architecture was chosen because all threads must share certain variables during the parallel stages.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Corpus]]></kwd>
<kwd lng="en"><![CDATA[generative model]]></kwd>
<kwd lng="en"><![CDATA[Dirichlet distribution]]></kwd>
<kwd lng="en"><![CDATA[latent topics]]></kwd>
<kwd lng="en"><![CDATA[parallelization]]></kwd>
<kwd lng="en"><![CDATA[algorithm]]></kwd>
<kwd lng="en"><![CDATA[C/C++ programming]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Abella]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Medina]]></surname>
<given-names><![CDATA[J. E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Segmentación lineal de texto por tópicos]]></source>
<year>2014</year>
<publisher-name><![CDATA[CENATAV]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Arbenz]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Petersen]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<source><![CDATA[Introduction to Parallel Computing (Oxford Texts in Applied and Engineering Mathematics)]]></source>
<year>2004</year>
<publisher-loc><![CDATA[New York, NY, USA ]]></publisher-loc>
<publisher-name><![CDATA[Oxford University Press, Inc.]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Barney]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[OpenMP]]></source>
<year>2015</year>
</nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Barney]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Introduction to parallel computing tutorial]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Barney]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Message Passing Interface (MPI)]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bezanson]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Julia micro-benchmarks]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bisgin]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Kelly]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Fang]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Xu]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Tong]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Investigating drug repositioning opportunities in FDA drug labels through topic modeling]]></article-title>
<source><![CDATA[BMC Bioinformatics]]></source>
<year>2012</year>
<volume>13</volume>
<numero>S6</numero>
<issue>S6</issue>
</nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Blei]]></surname>
<given-names><![CDATA[D. M.]]></given-names>
</name>
<name>
<surname><![CDATA[Jordan]]></surname>
<given-names><![CDATA[M. I.]]></given-names>
</name>
</person-group>
<source><![CDATA[Modeling annotated data]]></source>
<year>2003</year>
<conf-name><![CDATA[26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '03]]></conf-name>
<conf-loc>New York, NY, USA </conf-loc>
<page-range>127-34</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Boleda]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<source><![CDATA[GrAF version of Spanish portions of Wikipedia corpus]]></source>
<year>2012</year>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dueñas]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Velásquez]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Una aplicación de web opinion mining para la extracción de tendencias y tópicos de relevancia a partir de las opiniones consignadas en blogs y sitios de noticias]]></article-title>
<source><![CDATA[Revista Ingeniería de Sistemas]]></source>
<year>2013</year>
<volume>27</volume>
<page-range>33-54</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gebali]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<source><![CDATA[Algorithms and parallel computing]]></source>
<year>2011</year>
<publisher-loc><![CDATA[Hoboken, N.J. ]]></publisher-loc>
<publisher-name><![CDATA[Wiley]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Griffiths]]></surname>
<given-names><![CDATA[T. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Steyvers]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A probabilistic approach to semantic representation]]></article-title>
<source><![CDATA[Proceedings of the Annual Meeting of the Cognitive Science Society]]></source>
<year>2002</year>
<volume>24</volume>
</nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Griffiths]]></surname>
<given-names><![CDATA[T. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Steyvers]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Finding scientific topics]]></article-title>
<source><![CDATA[Proceedings of the National Academy of Sciences]]></source>
<year>2004</year>
<volume>101</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>5228-35</page-range></nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Heinrich]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Parameter estimation for text analysis]]></article-title>
<source><![CDATA[Technical Report, Fraunhofer]]></source>
<year>2008</year>
<publisher-loc><![CDATA[Germany ]]></publisher-loc>
<publisher-name><![CDATA[University of Leipzig]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hofmann]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Laskey]]></surname>
<given-names><![CDATA[K. B.]]></given-names>
</name>
<name>
<surname><![CDATA[Prade]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Probabilistic latent semantic analysis]]></source>
<year>1999</year>
<volume>15</volume>
<conf-name><![CDATA[ Fifteenth Conference on Uncertainty in Artificial Intelligence]]></conf-name>
<conf-loc> </conf-loc>
<page-range>289-96</page-range><publisher-loc><![CDATA[San Francisco, CA, USA ]]></publisher-loc>
<publisher-name><![CDATA[Morgan Kaufmann Publishers Inc.]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hofmann]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Laskey]]></surname>
<given-names><![CDATA[K. B.]]></given-names>
</name>
<name>
<surname><![CDATA[Prade]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Probabilistic latent semantic indexing]]></source>
<year>1999</year>
<volume>15</volume>
<conf-name><![CDATA[ 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval]]></conf-name>
<conf-loc> </conf-loc>
<page-range>50-7</page-range><publisher-loc><![CDATA[New York, NY, USA ]]></publisher-loc>
<publisher-name><![CDATA[Association for Computing Machinery]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hu]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Unsupervised learning of two Bible books: Proverbs and Psalms]]></article-title>
<source><![CDATA[Sociology Mind]]></source>
<year>2012</year>
<volume>02</volume>
<numero>03</numero>
<issue>03</issue>
<page-range>325-34</page-range></nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="">
<collab>Laboratorio Nacional de Supercómputo (LNS)</collab>
<source><![CDATA[]]></source>
<year>2021</year>
</nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Landauer]]></surname>
<given-names><![CDATA[T. K.]]></given-names>
</name>
<name>
<surname><![CDATA[Foltz]]></surname>
<given-names><![CDATA[P. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Laham]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An introduction to latent semantic analysis]]></article-title>
<source><![CDATA[Discourse Processes]]></source>
<year>1998</year>
<volume>25</volume>
<numero>2-3</numero>
<issue>2-3</issue>
<page-range>259-84</page-range></nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Minka]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Estimating a Dirichlet distribution]]></source>
<year>2000</year>
<publisher-name><![CDATA[MIT]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="">
<collab>MPI Forum</collab>
<source><![CDATA[MPI documents]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B22">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pacheco]]></surname>
<given-names><![CDATA[P. S.]]></given-names>
</name>
</person-group>
<source><![CDATA[An Introduction to Parallel Programming]]></source>
<year>2011</year>
<edition>1</edition>
<publisher-loc><![CDATA[Burlington, MA, USA ]]></publisher-loc>
<publisher-name><![CDATA[Morgan Kaufmann]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B23">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reese]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Boleda]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Cuadros]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Padró]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Rigau]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<source><![CDATA[Wikicorpus: A word-sense disambiguated multilingual Wikipedia corpus]]></source>
<year>2010</year>
<conf-name><![CDATA[Seventh International Conference on Language Resources and Evaluation, LREC'10]]></conf-name>
<conf-loc>Valletta, Malta </conf-loc>
</nlm-citation>
</ref>
<ref id="B24">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rodríguez]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Estudio de técnicas no supervisadas para descubrir tópicos en videos deportivos]]></source>
<year>2012</year>
<publisher-name><![CDATA[Universidad Jaume I]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B25">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ruiz]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Campos]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Clasificación de malformaciones craneales causadas por craneosinostosis primaria utilizando kernels no lineales]]></article-title>
<source><![CDATA[Revista Mexicana de Ingeniería Biomédica]]></source>
<year>2010</year>
<volume>31</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>15-29</page-range></nlm-citation>
</ref>
<ref id="B26">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Seiter]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Amft]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Rossi]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Tröster]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Discovery of activity composites using topic models: An analysis of unsupervised methods]]></article-title>
<source><![CDATA[Pervasive and Mobile Computing]]></source>
<year>2014</year>
<volume>15</volume>
<page-range>215-27</page-range></nlm-citation>
</ref>
<ref id="B27">
<nlm-citation citation-type="">
<collab>Software Intel</collab>
<source><![CDATA[OpenMP. pragmas and clauses summary]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B28">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<source><![CDATA[Distributed Gibbs sampling of latent topic models: The gritty details]]></source>
<year>2008</year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
