<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462020000200861</article-id>
<article-id pub-id-type="doi">10.13053/cys-24-2-3229</article-id>
<title-group>
<article-title xml:lang="es"><![CDATA[Etiquetado fonético automático al nivel palabra usando la dinámica de cambio de los vectores del libro código]]></article-title>
<article-title xml:lang="en"><![CDATA[Automatic Phonetic Labeling at Word Level Using the Dynamics of Changing Codebook Vectors]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Suárez Guerra]]></surname>
<given-names><![CDATA[Sergio]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Oropeza Rodríguez]]></surname>
<given-names><![CDATA[José Luis]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Mexico</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>06</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>06</month>
<year>2020</year>
</pub-date>
<volume>24</volume>
<numero>2</numero>
<fpage>861</fpage>
<lpage>874</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462020000200861&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462020000200861&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462020000200861&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="es"><p><![CDATA[Resumen: Se describe una solución alternativa referente al etiquetado fonético de un conjunto de palabras pronunciadas por un locutor, susceptible de utilizarse en cualquier idioma, según sean las necesidades y características asociadas a la propuesta. El procedimiento se basa en el seguimiento de la dinámica de cambio de los vectores cepstrales asociados a la frecuencia de Mel (MFCCs) que conforman el Libro Código (LC), extraído de la palabra a etiquetar. Esta dinámica de cambio analiza dónde ocurre una transición de un vector (MFCC) del LC a otro, así como las perturbaciones que ocurren en la zona de cambio debido a la concatenación fonética. Se establecen métricas para considerar el ruido de coarticulación y definir la ubicación de la frontera de separación fonética. Se usan dos métodos para evaluar la dinámica de cambio de los vectores y entregar el etiquetado más acertado. El porcentaje de reconocimiento y etiquetado correcto obtenido con esta aplicación es del 97.9%, inferior en un 1.06% con respecto al porcentaje de reconocimiento obtenido sobre el mismo corpus de palabras, pero haciendo uso de un etiquetado manual. Lo más importante es que el tiempo utilizado en el etiquetado del corpus de voz de forma automática es significativamente menor que el estimado de hacerse manualmente, además de eliminar la subjetividad personal en el trabajo de etiquetado.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: An alternative solution is described for the phonetic labeling of a set of words pronounced by a speaker, usable in any language, according to the needs and characteristics associated with the proposal. The procedure is based on tracking the dynamics of change of the Mel-frequency cepstral coefficient (MFCC) vectors that make up the codebook (LC, from the Spanish Libro Código) extracted from the word to be labeled. This analysis identifies where a transition from one codebook vector (MFCC) to another occurs, as well as the disturbances that occur in the transition zone due to phonetic concatenation. Metrics are established to account for coarticulation noise and to define the location of the phonetic boundary. Two methods are used to evaluate the dynamics of vector change and deliver the most accurate labeling. The percentage of correct recognition and labeling obtained with this application is 97.9%, which is 1.06% lower than the recognition percentage obtained on the same corpus of words using manual labeling. Most importantly, the time required to label the speech corpus automatically is significantly less than the estimated time to label it manually, and automatic labeling eliminates personal subjectivity from the labeling work.]]></p></abstract>
<kwd-group>
<kwd lng="es"><![CDATA[Etiquetado fonético]]></kwd>
<kwd lng="es"><![CDATA[reconocimiento de voz]]></kwd>
<kwd lng="en"><![CDATA[Phonetic labeling]]></kwd>
<kwd lng="en"><![CDATA[voice recognition]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="book">
<collab>UPV</collab>
<source><![CDATA[Desarrollo de un sistema de reconocimiento automático del habla]]></source>
<year>2010</year>
<page-range>8-9</page-range><publisher-name><![CDATA[Universidad Politécnica de Valencia, Escuela Superior de Ingeniería Informática]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="book">
<collab>UPC</collab>
<source><![CDATA[Técnicas de procesado y representación de la señal de voz para el reconocimiento del habla en ambientes ruidosos]]></source>
<year>1993</year>
<publisher-name><![CDATA[Universidad Politécnica de Cataluña, Departamento de Teoría de la Señal y Comunicaciones]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Galka]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ziolko]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Wavelets in Speech Segmentation]]></source>
<year>2008</year>
<numero>5-7</numero>
<conf-name><![CDATA[The 14th IEEE Mediterranean Electrotechnical Conference (MELECON'08)]]></conf-name>
<conf-loc> </conf-loc>
<issue>5-7</issue>
<page-range>876-9</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ziolko]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Manandhar]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Wilson]]></surname>
<given-names><![CDATA[R.C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Phoneme Segmentation of Speech]]></source>
<year>2006</year>
<volume>40</volume>
<numero>307</numero>
<conf-name><![CDATA[18th International Conference on Pattern Recognition (ICPR'06)]]></conf-name>
<conf-loc> </conf-loc>
<issue>307</issue>
<page-range>282-5</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hosom]]></surname>
<given-names><![CDATA[J.P.]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic Time Alignment of Phonemes using Acoustic-Phonetic Information]]></source>
<year>2000</year>
<publisher-name><![CDATA[Institute of Science and Technology]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bansal]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Pradhan]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Arora]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Speech Synthesis - Automatic Segmentation]]></article-title>
<source><![CDATA[International Journal of Computer Applications]]></source>
<year>2014</year>
<volume>98</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>29-31</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Toledano]]></surname>
<given-names><![CDATA[D.T.]]></given-names>
</name>
<name>
<surname><![CDATA[Gómez]]></surname>
<given-names><![CDATA[L.A.H.]]></given-names>
</name>
<name>
<surname><![CDATA[Grande]]></surname>
<given-names><![CDATA[L.V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Automatic Phonetic Segmentation]]></article-title>
<source><![CDATA[IEEE Transactions on Speech and Audio Processing]]></source>
<year>2003</year>
<volume>11</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>617-25</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Davis]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Mermelstein]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences]]></article-title>
<source><![CDATA[IEEE Transactions on Acoustics, Speech, and Signal Processing]]></source>
<year>1980</year>
<volume>28</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>357-66</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hernández-Mena]]></surname>
<given-names><![CDATA[C.D.]]></given-names>
</name>
<name>
<surname><![CDATA[Herrera-Camacho]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[CIEMPIESS: A New Open-Sourced Mexican Spanish Radio Corpus]]></source>
<year>2017</year>
<publisher-name><![CDATA[Departamento de Procesamiento Digital de Señales. Universidad Nacional Autónoma de México (UNAM)]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<collab>LNCS</collab>
<source><![CDATA[Automatic Phoneme Border Detection improves Speech Recognition]]></source>
<year>2015</year>
<conf-name><![CDATA[ MICAI 2015. LNAI 9413]]></conf-name>
<conf-loc> </conf-loc>
<page-range>127-38</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Quilis]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Tratado de Fonología y Fonética Españolas]]></source>
<year>1999</year>
<publisher-name><![CDATA[Gredos]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Linde]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Buzo]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Gray]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An Algorithm for Vector Quantizer Design]]></article-title>
<source><![CDATA[IEEE Transactions on Communications]]></source>
<year>1980</year>
<volume>28</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>84-95</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rabiner]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Juang]]></surname>
<given-names><![CDATA[B.-H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Fundamentals of Speech Recognition]]></source>
<year>1993</year>
<publisher-name><![CDATA[Prentice Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Young]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[The HTK Book]]></source>
<year>2006</year>
<publisher-name><![CDATA[Cambridge]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Young]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[The HTK Toolkit]]></source>
<year>2006</year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
