<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462006000100007</article-id>
<title-group>
<article-title xml:lang="es"><![CDATA[Algoritmos y Métodos para el Reconocimiento de Voz en Español Mediante Sílabas]]></article-title>
<article-title xml:lang="en"><![CDATA[Algorithms and Methods for the Automatic Speech Recognition in Spanish Language using Syllables]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Oropeza Rodríguez]]></surname>
<given-names><![CDATA[José Luis]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Suárez Guerra]]></surname>
<given-names><![CDATA[Sergio]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Centro de Investigación en Computación, IPN]]></institution>
<addr-line><![CDATA[México D. F.]]></addr-line>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>03</month>
<year>2006</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>03</month>
<year>2006</year>
</pub-date>
<volume>9</volume>
<numero>3</numero>
<fpage>270</fpage>
<lpage>286</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462006000100007&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462006000100007&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462006000100007&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="es"><p><![CDATA[Actualmente el uso de los fonemas tiene implícitas varias dificultades, debido a que las fronteras entre ellos por lo regular son difíciles de identificar en las representaciones acústicas de la voz. El presente trabajo plantea una alternativa a la forma en la que el reconocimiento de voz se ha venido implementando desde hace ya bastante tiempo, analizando la forma en la cual el paradigma de la sílaba responde a tal labor dentro del español. Durante los experimentos realizados se examinaron para la tarea de segmentación tres elementos esenciales: a) la Función de Energía Total en Corto Tiempo, b) la Función de Energía de altas frecuencias Cepstrales (conocida como Energía del parámetro RO), y c) un Sistema Basado en Conocimiento. Tanto el Sistema Basado en Conocimiento como la Función de Energía Total en Corto Tiempo fueron usados en un corpus de dígitos, en donde los resultados alcanzados usando sólo la Función de Energía Total en Corto Tiempo fueron de 90.58%. Cuando se utilizaron la Función de Energía Total en Corto Tiempo y la Energía del parámetro RO se obtuvo un 94.70% de razón de reconocimiento, lo cual representa un incremento del 5% con relación al uso de palabras completas en un corpus de voz dependiente de contexto.
Por otro lado, cuando se utilizó un corpus de laboratorio de habla continua, al usar la Función de Energía Total en Corto Tiempo y el Sistema Basado en Conocimiento se alcanzó un 78.5% de razón de reconocimiento, y un 80.5% al usar los tres parámetros anteriores. El modelo del lenguaje utilizado en este caso fue el bigrama y se utilizaron Cadenas Ocultas de Markov de densidad continua con tres y cinco estados, con 3 mixturas Gaussianas por estado.]]></p></abstract>
<abstract abstract-type="short" xml:lang="en"><p><![CDATA[This work examines the results of incorporating syllable units into Automatic Speech Recognition for the Spanish language. Because the boundaries between phoneme-like units are often difficult to locate in acoustic representations of speech, phoneme-based units have not reached good performance in Automatic Speech Recognition. In the course of the experiments, three approaches to the segmentation task were examined: a) the Short Term Total Energy Function, b) the Energy Function of the Cepstral High Frequency (known as the RO parameter), and c) a Knowledge Based System. These represent the most important contributions of this work; they showed good results for the continuous and discontinuous speech corpora developed in the laboratory. The Knowledge Based System and the Short Term Total Energy Function were used on a digit corpus, where the Short Term Total Energy Function alone reached a 90.58% recognition rate. When the Short Term Total Energy Function and the RO parameter were used together, a 94.70% recognition rate was achieved. In the continuous speech corpus created in the laboratory, a 78.5% recognition rate was achieved using the Short Term Total Energy Function and the Knowledge Based System, and an 80.5% recognition rate using the three approaches mentioned above. A bigram language model and Continuous Density Hidden Markov Models with three and five states, with three Gaussian mixtures per state, were implemented. By further including a larger number of digital filters and Artificial Intelligence techniques in the training and recognition stages, respectively, the results can be improved even more. This research showed the potential of the syllabic unit paradigm for Automatic Speech Recognition in the Spanish language. Finally, inference rules associated with the rules for splitting words into syllables in Spanish were created for the Knowledge Based System.]]></p></abstract>
<kwd-group>
<kwd lng="es"><![CDATA[Reconocimiento de voz]]></kwd>
<kwd lng="es"><![CDATA[reconocimiento de sílabas]]></kwd>
<kwd lng="es"><![CDATA[sistemas expertos]]></kwd>
<kwd lng="es"><![CDATA[procesamiento de voz]]></kwd>
<kwd lng="en"><![CDATA[Speech recognition]]></kwd>
<kwd lng="en"><![CDATA[Syllables recognition]]></kwd>
<kwd lng="en"><![CDATA[Expert System]]></kwd>
<kwd lng="en"><![CDATA[Speech processing]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <p align="justify"><font face="verdana" size="4">Resumen de tesis doctoral</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="center"><font face="verdana" size="4"><b>Algoritmos y M&eacute;todos para el Reconocimiento de Voz en Espa&ntilde;ol Mediante S&iacute;labas</b></font></p>     <p align="center"><font face="verdana" size="2">&nbsp;</font></p>     <p align="center"><font face="verdana" size="4"><i>Algorithms and Methods for the Automatic Speech Recognition in Spanish Language using </i><i>Syllables</i></font></p>     <p align="center"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Graduated: Jos&eacute; Luis Oropeza Rodr&iacute;guez    <br> </b><i>Centro de Investigaci&oacute;n en Computaci&oacute;n&#150;IPN    <br> Av. Juan de Dios B&aacute;tiz s/n esq. Miguel Oth&oacute;n Mendiz&aacute;bal C. P. 07738 M&eacute;xico D. F.</i>    <br> <a href="mailto:j_orope@yahoo.com.mx">j_orope@yahoo.com.mx</a>    ]]></body>
<body><![CDATA[<br> <u>Graduado el 15 de diciembre de 2006</u></font></p>     <p align="justify"><font face="verdana" size="2"><b>Advisor: Sergio Su&aacute;rez Guerra    <br> </b><i>Centro de Investigaci&oacute;n en Computaci&oacute;n&#150;IPN    <br> Av. Juan de Dios B&aacute;tiz s/n esq. Miguel Oth&oacute;n Mendiz&aacute;bal C. P. 07738 M&eacute;xico D. F.</i>    <br> <a href="mailto:ssuare@cic.ipn.mx">ssuare@cic.ipn.mx</a></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Resumen</b></font></p>     <p align="justify"><font face="verdana" size="2">Actualmente el uso de los fonemas tiene impl&iacute;citas varias dificultades, debido a que las fronteras entre ellos por lo regular son dif&iacute;ciles de identificar en las representaciones ac&uacute;sticas de la voz. El presente trabajo plantea una alternativa a la forma en la que el reconocimiento de voz se ha venido implementando desde hace ya bastante tiempo, analizando la forma en la cual el paradigma de la s&iacute;laba responde a tal labor dentro del espa&ntilde;ol. Durante los experimentos realizados se examinaron para la tarea de segmentaci&oacute;n tres elementos esenciales: a) la Funci&oacute;n de Energ&iacute;a Total en Corto Tiempo, b) la Funci&oacute;n de Energ&iacute;a de altas frecuencias Cepstrales (conocida como Energ&iacute;a del par&aacute;metro RO), y c) un Sistema Basado en Conocimiento. Tanto el Sistema Basado en Conocimiento como la Funci&oacute;n de Energ&iacute;a Total en Corto Tiempo fueron usados en un corpus de d&iacute;gitos, en donde los resultados alcanzados usando s&oacute;lo la Funci&oacute;n de Energ&iacute;a Total en Corto Tiempo fueron de 90.58%. Cuando se utilizaron la Funci&oacute;n de Energ&iacute;a Total en Corto Tiempo y la Energ&iacute;a del par&aacute;metro RO se obtuvo un 94.70% de raz&oacute;n de reconocimiento. 
Lo cual representa un incremento del 5% con relaci&oacute;n al uso de palabras completas en un corpus de voz dependiente de contexto. Por otro lado, cuando se utiliz&oacute; un corpus de laboratorio de habla continua, al usar la Funci&oacute;n de Energ&iacute;a Total en Corto Tiempo y el Sistema Basado en Conocimiento se alcanz&oacute; un 78.5% de raz&oacute;n de reconocimiento, y un 80.5% al usar los tres par&aacute;metros anteriores. El modelo del lenguaje utilizado en este caso fue el bigrama y se utilizaron Cadenas Ocultas de Markov de densidad continua con tres y cinco estados, con 3 mixturas Gaussianas por estado.</font></p>     <p align="justify"><font face="verdana" size="2"><b>Palabras clave: </b>Reconocimiento de voz, reconocimiento de s&iacute;labas, sistemas expertos, procesamiento de voz.</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2"><b>Abstract</b></font></p>     <p align="justify"><font face="verdana" size="2">This work examines the results of incorporating syllable units into Automatic Speech Recognition for the Spanish language. Because the boundaries between phoneme&#150;like units are often difficult to locate in acoustic representations of speech, phoneme&#150;based units have not reached good performance in Automatic Speech Recognition. In the course of the experiments, three approaches to the segmentation task were examined: a) the Short Term Total Energy Function, b) the Energy Function of the Cepstral High Frequency (known as the RO parameter), and c) a Knowledge Based System. These represent the most important contributions of this work; they showed good results for the continuous and discontinuous speech corpora developed in the laboratory.</font></p>     <p align="justify"><font face="verdana" size="2">The Knowledge Based System and the Short Term Total Energy Function were used on a digit corpus, where the Short Term Total Energy Function alone reached a 90.58% recognition rate. When the Short Term Total Energy Function and the RO parameter were used together, a 94.70% recognition rate was achieved. In the continuous speech corpus created in the laboratory, a 78.5% recognition rate was achieved using the Short Term Total Energy Function and the Knowledge Based System, and an 80.5% recognition rate using the three approaches mentioned above. A bigram language model and Continuous Density Hidden Markov Models with three and five states, with three Gaussian mixtures per state, were implemented.</font></p>     <p align="justify"><font face="verdana" size="2">By further including a larger number of digital filters and Artificial Intelligence techniques in the training and recognition stages, respectively, the results can be improved even more. 
This research showed the potential of the syllabic unit paradigm for Automatic Speech Recognition in the Spanish language. Finally, inference rules associated with the rules for splitting words into syllables in Spanish were created for the Knowledge Based System.</font></p>     <p align="justify"><font face="verdana" size="2"><b>Keywords: </b>Speech recognition, Syllables recognition, Expert System, Speech processing.</font></p>     <p align="justify">&nbsp;</p>     <p align="justify"><font face="verdana" size="2"><a href="/pdf/cys/v9n3/v9n3a7.pdf" target="_blank">DESCARGA ARTICULO EN FORMATO PDF</a></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Referencias</b></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2">1. Feal (2000). Feal L., "Sobre el uso de la s&iacute;laba como unidad de s&iacute;ntesis en el espa&ntilde;ol", Informe T&eacute;cnico, Departamento de Inform&aacute;tica, Universidad de Valladolid, 2000.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040314&pid=S1405-5546200600010000700001&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">2. Fosler et al. (1999). Fosler&#150;Lussier E., Greenberg S., Morgan N., "Incorporating Contextual Phonetics into Automatic Speech Recognition". XIV International Congress of Phonetic Sciences, pp. 
611&#150;614, San Francisco, 1999.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040315&pid=S1405-5546200600010000700002&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">3. Giarratano and Riley (2001). Giarratano Joseph y Riley Gary, International Thompson Editores,    Sistemas expertos, principios y programaci&oacute;n 2001.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040316&pid=S1405-5546200600010000700003&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">4. Hauenstein (1996). Hauenstein A., "The syllable Re&#150;revisited", Technical Report, Siemens AG, Corporate Research and Development, M&uuml;nchen Alemania, 1996.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040317&pid=S1405-5546200600010000700004&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">5. Jackson (1986). Jackson L. B. "Digital Filters and Signal Processing". Kluwer Academic Publishers. 
University of Louisville, Department of Electrical and Computer Engineering, U.S.A., 1986</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040318&pid=S1405-5546200600010000700005&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">6. Jones et al. (1999). Jones R., Downey S., Mason J., "Continuous Speech Recognition Using Syllables", Proceedings of Eurospeech, Vol. 3, pp. 1171&#150;1174, Rhodes, Grecia 1999.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040319&pid=S1405-5546200600010000700006&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">7. Kamakshi et al.  (2002). Kamakshi V. Prasad, Nagarajan T. and Murthy Hema A.. "Continuous Speech Recognition Using Automatically Segmented Data at Syllabic Units". Department of Computer Science and Engineering. Indian Institute of Technology, Madras, Chennai 600&#150;036. 2002.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040320&pid=S1405-5546200600010000700007&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">8. Kirschning (1998). 
Kirschning Albers Ingrid, "Automatic Speech Recognition with the Parallel Cascade Neural Network", PhD Thesis, Tokyo, Japan, March 1998.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040321&pid=S1405-5546200600010000700008&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">9. Kosko (1992). Kosko B., "Neural Networks for Signal Processing", Prentice Hall, U.S.A., 1992.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040322&pid=S1405-5546200600010000700009&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">10. Meneido et al. (1999). Meneido Hugo, Neto Jo&atilde;o P. and Almeida Lu&iacute;s B., INESC&#150;IST, "Syllable Onset Detection Applied to the Portuguese Language". Sixth European Conference on Speech Communication and Technology (EUROSPEECH'99), Budapest, Hungary, September 5&#150;9, 1999.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040323&pid=S1405-5546200600010000700010&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">11. Meneido and Neto (2000). Meneido H., Neto J., "Combination of Acoustic Models in Continuous Speech Recognition Hybrid Systems". 
INESC, Rua Alves Redol, 9, 1000&#150;029 Lisbon, Portugal, 2000.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040324&pid=S1405-5546200600010000700011&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">12. Mermelstein (1975). Mermelstein Paul "Automatic Segmentation of Speech into Syllabic Units". Haskins Laboratories, New Haven, Connecticut 06510, pp. 880&#150;883,58 (4), June 1975.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040325&pid=S1405-5546200600010000700012&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">13. Oropeza (2000). Oropeza Rodr&iacute;guez Jos&eacute; Luis, "Reconocimiento de Comandos Verbales usando HMM". Tesis de maestr&iacute;a, Centro de Investigaci&oacute;n en Computaci&oacute;n, Noviembre 2000.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040326&pid=S1405-5546200600010000700013&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">14. Rabiner and Biing&#150;Hwang (1993). 
Lawrence Rabiner and Biing&#150;Hwang Juang, "Fundamentals of Speech Recognition", Prentice Hall, 1993.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040327&pid=S1405-5546200600010000700014&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">15. Resch (2001a). Resch Barbara. "Gaussian Statistics and Unsupervised Learning". A tutorial for the Course Computational      Intelligence      Signal      Processing      and      Speech      Communication      Laboratory. <a href="http://www.igi.tugraz.at/lehre/CI/" target="_blank">www.igi.tugraz.at/lehre/CI</a>, November 15, 2001.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040328&pid=S1405-5546200600010000700015&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">16. Resch (2001b). Resch Barbara. "Hidden Markov Models". A Tutorial for the Course Computational Laboratory. Signal Processing and Speech Communication Laboratory. <a href="http://www.igi.tugraz.at/lehre/CI/" target="_blank">www.igi.turgaz.at/lehre/CI</a>, November 15, 2001.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040329&pid=S1405-5546200600010000700016&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">17. Russell and Norvig (1996). 
Russell Stuart and Norvig Peter, Inteligencia Artificial un enfoque moderno, Prentice Hall, 1996.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040330&pid=S1405-5546200600010000700017&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">18. Savage (1995). Savage Carmona Jesus, "A Hybrid Systems with Symbolic AI and Statistical Methods for Speech Recognition". PhD Thesis, University of Washington, 1995.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040331&pid=S1405-5546200600010000700018&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">19. Su&aacute;rez (2005). Su&aacute;rez Guerra Sergio, &iquest;100% de reconocimiento de voz?. Trabajo in&eacute;dito, no publicado.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040332&pid=S1405-5546200600010000700019&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">20. Sydral et al. (1995). Sydral A., Bennet R., Greenspan S., "Applied Speech Technology", Eds (1995). 
CRC Press, ISBN 0&#150;8493&#150;9456&#150;2, U.S.A., 1995.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040333&pid=S1405-5546200600010000700020&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">21. Weber (2000). Weber K., "Multiple Timescale Feature Combination Towards Robust Speech Recognition". Konferenz zur Verarbeitung nat&uuml;rlicher Sprache KOVENS2000, Ilmenau, Alemania, 2000.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040334&pid=S1405-5546200600010000700021&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">22. Wu (1998).     Wu,  S., "Incorporating information from syllable&#150;length time scales into automatic speech recognition", PhD Thesis, Berkeley University, 1998.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040335&pid=S1405-5546200600010000700022&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">23. Wu et al. (1997). Wu S., Shire M., Greenberg S., Morgan N., "Integrating Syllable Boundary Information into Automatic Speech Recognition ". ICASSP&#150;97, Vol. 1, Munich Germany, vol.2 pp. 
987&#150;990, 1997.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040336&pid=S1405-5546200600010000700023&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2">24. Zhang (1999). Zhang Jialu, "On the syllable structures of Chinese relating to speech recognition", Institute of Acoustics, Academia Sinica Beijing, China, 1999.</font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2040337&pid=S1405-5546200600010000700024&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --> ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Feal]]></surname>
<given-names><![CDATA[L]]></given-names>
</name>
</person-group>
<source><![CDATA[Sobre el uso de la sílaba como unidad de síntesis en el español: Informe Técnico]]></source>
<year>2000</year>
<publisher-name><![CDATA[Departamento de Informática, Universidad de Valladolid]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fosler-Lussier]]></surname>
<given-names><![CDATA[E]]></given-names>
</name>
<name>
<surname><![CDATA[Greenberg]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Morgan]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
</person-group>
<source><![CDATA[Incorporating Contextual Phonetics into Automatic Speech Recognition]]></source>
<year>1999</year>
<conf-name><![CDATA[ XIV International Congress of Phonetic Sciences]]></conf-name>
<conf-loc> </conf-loc>
<page-range>611-614</page-range><publisher-loc><![CDATA[San Francisco ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Giarratano]]></surname>
<given-names><![CDATA[Joseph]]></given-names>
</name>
<name>
<surname><![CDATA[Riley]]></surname>
<given-names><![CDATA[Gary]]></given-names>
</name>
</person-group>
<source><![CDATA[Sistemas expertos, principios y programación]]></source>
<year>2001</year>
<publisher-name><![CDATA[International Thompson Editores]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hauenstein]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
</person-group>
<source><![CDATA[The syllable Re-revisited: Technical Report]]></source>
<year>1996</year>
<publisher-loc><![CDATA[München ]]></publisher-loc>
<publisher-name><![CDATA[Siemens AG, Corporate Research and Development]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jackson]]></surname>
<given-names><![CDATA[L. B]]></given-names>
</name>
</person-group>
<source><![CDATA[Digital Filters and Signal Processing]]></source>
<year>1986</year>
<publisher-name><![CDATA[Kluwer Academic Publishers; University of Louisville, Department of Electrical and Computer Engineering]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jones]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Downey]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Mason]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<source><![CDATA[Continuous Speech Recognition Using Syllables: Proceedings of Eurospeech]]></source>
<year>1999</year>
<volume>3</volume>
<page-range>1171-1174</page-range><publisher-loc><![CDATA[Rhodes ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Prasad]]></surname>
<given-names><![CDATA[Kamakshi V]]></given-names>
</name>
<name>
<surname><![CDATA[Nagarajan]]></surname>
<given-names><![CDATA[T]]></given-names>
</name>
<name>
<surname><![CDATA[Murthy]]></surname>
<given-names><![CDATA[Hema A]]></given-names>
</name>
</person-group>
<source><![CDATA[Continuous Speech Recognition Using Automatically Segmented Data at Syllabic Units]]></source>
<year>2002</year>
<publisher-loc><![CDATA[Madras ]]></publisher-loc>
<publisher-name><![CDATA[Department of Computer Science and Engineering. Indian Institute of Technology]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kirschning Albers]]></surname>
<given-names><![CDATA[Ingrid]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic Speech Recognition with the Parallel Cascade Neural Network]]></source>
<year>1998</year>
<month>March</month>
<publisher-loc><![CDATA[Tokyo ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kosko]]></surname>
<given-names><![CDATA[B]]></given-names>
</name>
</person-group>
<source><![CDATA[Neural Networks for Signal Processing]]></source>
<year>1992</year>
<publisher-name><![CDATA[Prentice Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Meneido]]></surname>
<given-names><![CDATA[Hugo]]></given-names>
</name>
<name>
<surname><![CDATA[Neto]]></surname>
<given-names><![CDATA[João P]]></given-names>
</name>
<name>
<surname><![CDATA[Almeida]]></surname>
<given-names><![CDATA[Luís B]]></given-names>
</name>
</person-group>
<source><![CDATA[Syllable Onset Detection Applied to the Portuguese Language]]></source>
<year>1999</year>
<conf-name><![CDATA[ Sixth European Conference on Speech Communication and Technology (EUROSPEECH'99)]]></conf-name>
<conf-loc><![CDATA[Budapest]]></conf-loc>
</nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Meinedo]]></surname>
<given-names><![CDATA[H]]></given-names>
</name>
<name>
<surname><![CDATA[Neto]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
</person-group>
<source><![CDATA[Combination of Acoustic Models in Continuous Speech Recognition Hybrid Systems]]></source>
<year>2000</year>
<volume>9</volume>
<page-range>1000-029</page-range><publisher-loc><![CDATA[Lisbon ]]></publisher-loc>
<publisher-name><![CDATA[INESC, Rua Alves Redol]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mermelstein]]></surname>
<given-names><![CDATA[Paul]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic Segmentation of Speech into Syllabic Units]]></source>
<year>1975</year>
<month>June</month>
<volume>58</volume><issue>4</issue>
<page-range>880-883</page-range><publisher-loc><![CDATA[New Haven ]]></publisher-loc>
<publisher-name><![CDATA[Haskins Laboratories]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Oropeza Rodríguez]]></surname>
<given-names><![CDATA[José Luis]]></given-names>
</name>
</person-group>
<source><![CDATA[Reconocimiento de Comandos Verbales usando HMM]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rabiner]]></surname>
<given-names><![CDATA[Lawrence]]></given-names>
</name>
<name>
<surname><![CDATA[Juang]]></surname>
<given-names><![CDATA[Biing-Hwang]]></given-names>
</name>
</person-group>
<source><![CDATA[Fundamentals of Speech Recognition]]></source>
<year>1993</year>
<publisher-name><![CDATA[Prentice Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Resch]]></surname>
<given-names><![CDATA[Barbara]]></given-names>
</name>
</person-group>
<source><![CDATA[Gaussian Statistics and Unsupervised Learning: A Tutorial for the Course Computational Intelligence. Signal Processing and Speech Communication Laboratory]]></source>
<year>2001</year>
</nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Resch]]></surname>
<given-names><![CDATA[Barbara]]></given-names>
</name>
</person-group>
<source><![CDATA[Hidden Markov Models: A Tutorial for the Course Computational Intelligence. Signal Processing and Speech Communication Laboratory]]></source>
<year>2001</year>
</nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Russell]]></surname>
<given-names><![CDATA[Stuart]]></given-names>
</name>
<name>
<surname><![CDATA[Norvig]]></surname>
<given-names><![CDATA[Peter]]></given-names>
</name>
</person-group>
<source><![CDATA[Inteligencia Artificial: un enfoque moderno]]></source>
<year>1996</year>
<publisher-name><![CDATA[Prentice Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Savage Carmona]]></surname>
<given-names><![CDATA[Jesus]]></given-names>
</name>
</person-group>
<source><![CDATA[A Hybrid System with Symbolic AI and Statistical Methods for Speech Recognition]]></source>
<year>1995</year>
<publisher-name><![CDATA[University of Washington]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Suárez Guerra]]></surname>
<given-names><![CDATA[Sergio]]></given-names>
</name>
</person-group>
<source><![CDATA[¿100% de reconocimiento de voz?]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Syrdal]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Bennett]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
<name>
<surname><![CDATA[Greenspan]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<source><![CDATA[Applied Speech Technology]]></source>
<year>1995</year>
<publisher-name><![CDATA[CRC Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Weber]]></surname>
<given-names><![CDATA[K]]></given-names>
</name>
</person-group>
<source><![CDATA[Multiple Timescale Feature Combination Towards Robust Speech Recognition]]></source>
<conf-name><![CDATA[Konferenz zur Verarbeitung natürlicher Sprache (KONVENS 2000)]]></conf-name>
<year>2000</year>
<publisher-loc><![CDATA[Ilmenau ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
</person-group>
<source><![CDATA[Incorporating information from syllable-length time scales into automatic speech recognition]]></source>
<year>1998</year>
<publisher-name><![CDATA[University of California, Berkeley]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Shire]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
<name>
<surname><![CDATA[Greenberg]]></surname>
<given-names><![CDATA[S]]></given-names>
</name>
<name>
<surname><![CDATA[Morgan]]></surname>
<given-names><![CDATA[N]]></given-names>
</name>
</person-group>
<source><![CDATA[Integrating Syllable Boundary Information into Automatic Speech Recognition]]></source>
<year>1997</year>
<volume>2</volume>
<page-range>987-990</page-range><publisher-loc><![CDATA[Munich ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[Jialu]]></given-names>
</name>
</person-group>
<source><![CDATA[On the syllable structures of Chinese relating to speech recognition]]></source>
<year>1999</year>
<publisher-loc><![CDATA[Beijing ]]></publisher-loc>
<publisher-name><![CDATA[Institute of Acoustics, Academia Sinica]]></publisher-name>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
