<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462011000300003</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Speaker Verification in Different Database Scenarios]]></article-title>
<article-title xml:lang="es"><![CDATA[Verificación de hablante en diferentes escenarios de base de datos]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[García Perera]]></surname>
<given-names><![CDATA[Leibny Paola]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Aceves López]]></surname>
<given-names><![CDATA[Roberto]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Nolazco Flores]]></surname>
<given-names><![CDATA[Juan]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Tecnológico de Monterrey, Departamento de Ciencias Computacionales]]></institution>
<addr-line><![CDATA[Monterrey, Nuevo León]]></addr-line>
<country>México</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>09</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>09</month>
<year>2011</year>
</pub-date>
<volume>15</volume>
<numero>1</numero>
<fpage>17</fpage>
<lpage>26</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462011000300003&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462011000300003&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462011000300003&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[This document shows the results of our speaker verification system under two scenarios: the Face and Speaker Verification Evaluation organized by MOBIO (MObile BIOmetric consortium) and the Speaker Recognition Evaluation 2010 organized by NIST. The core of our system is based on a Gaussian Mixture Model (GMM) and maximum likelihood (ML) framework. First, it extracts the relevant speech features by computing the Mel Frequency Cepstral Coefficients (MFCC). Then, the MFCCs are used to train gender-dependent GMMs that are later adapted to obtain target models. To obtain reliable performance statistics, these target models are evaluated against a set of trials and final scores are computed. Finally, those scores are tagged as target or impostor. We tried several system configurations and found that each database requires specific tuning to improve performance. For the MOBIO database we obtained an average equal error rate (EER) of 16.43%. For the NIST 2010 database we achieved an average EER of 16.61%. The NIST 2010 database considers various conditions. Of those conditions, the interview training and testing condition showed the best EER, 10.94%, followed by the phone-call training and phone-call testing condition with an EER of 13.35%.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[Este documento muestra los resultados de nuestro sistema de verificación de hablante bajo dos escenarios: la Evaluación Face and Speaker Verification Evaluation organizada por MOBIO (MObile BIOmetric consortium) y la Evaluación de Reconocimiento de personas 2010 organizada por NIST. La parte central de nuestro esquema se basa en un modelado de Mezclas de Gaussianas (GMM) y máxima verosimilitud. Primero, se extraen los parámetros importantes de la voz calculando los coeficientes cepstrales en escala mel, Mel Frequency Cepstral Coefficients (MFCC). Después, dichos MFCCs entrenan las mezclas de Gaussianas dependientes del género que posteriormente serán adaptadas y se obtendrán los modelos de los usuarios objetivo. Para obtener estadísticas confiables esos modelos objetivo son evaluados por un conjunto de señales no conocidas y se obtienen puntuaciones finales. Por último, esas puntuaciones son etiquetadas como usuario objetivo o impostor. Hemos analizado diferentes configuraciones y encontramos que cada base de datos requiere una sintonización adecuada para mejorar su desempeño. Para la base de datos MOBIO, obtuvimos un porcentaje de error promedio de 16.43 %. Para la base de datos NIST2010, logramos un promedio de error de 16.61%. La base de datos NIST2010 considera varias condiciones. De esas condiciones, la condición de entrevista para entrenamiento y prueba mostró el mejor error con 10.94 %, seguida por la condición de llamada telefónica en entrenamiento y llamada telefónica en prueba con 13.35%.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Speaker verification and authentication]]></kwd>
<kwd lng="es"><![CDATA[Verificación de hablante y autenticación]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <p align="justify"><font face="verdana" size="4">Art&iacute;culos</font></p>     <p align="justify"><font face="verdana" size="4">&nbsp;</font></p>     <p align="center"><font face="verdana" size="4"><b>Speaker Verification in Different Database Scenarios</b></font></p>     <p align="center"><font face="verdana" size="2">&nbsp;</font></p>     <p align="center"><font face="verdana" size="3"><b>Verificaci&oacute;n de hablante en diferentes escenarios de base de datos</b></font></p>     <p align="center"><font face="verdana" size="2">&nbsp;</font></p>     <p align="center"><font face="verdana" size="2"><b>Leibny Paola Garc&iacute;a Perera, Roberto Aceves L&oacute;pez, and Juan Nolazco Flores</b></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><i>Departamento de Ciencias Computacionales, Tecnol&oacute;gico de Monterrey, Monterrey, Nuevo Le&oacute;n, M&eacute;xico. E&#150;mail:</i> <a href="mailto:paola.garcia@itesm.mx">paola.garcia@itesm.mx</a>, <a href="mailto:aceves@itesm.mx">aceves@itesm.mx</a>, <a href="mailto:jnolazco@itesm.mx">jnolazco@itesm.mx</a></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2">Article received on July 30, 2010.    <br> Accepted on January 15, 2011.</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Abstract</b></font></p>     <p align="justify"><font face="verdana" size="2">This document shows the results of our speaker verification system under two scenarios: the Face and Speaker Verification Evaluation organized by MOBIO (MObile BIOmetric consortium) and the Speaker Recognition Evaluation 2010 organized by NIST. The core of our system is based on a Gaussian Mixture Model (GMM) and maximum likelihood (ML) framework. First, it extracts the relevant speech features by computing the Mel Frequency Cepstral Coefficients (MFCC). Then, the MFCCs are used to train gender&#150;dependent GMMs that are later adapted to obtain target models. To obtain reliable performance statistics, these target models are evaluated against a set of trials and final scores are computed. Finally, those scores are tagged as target or impostor. We tried several system configurations and found that each database requires specific tuning to improve performance. For the MOBIO database we obtained an average equal error rate (EER) of 16.43%. For the NIST 2010 database we achieved an average EER of 16.61%. The NIST 2010 database considers various conditions. 
Of those conditions, the interview training and testing condition showed the best EER, 10.94%, followed by the phone&#150;call training and phone&#150;call testing condition with an EER of 13.35%.</font></p>     <p align="justify"><font face="verdana" size="2"><b>Keywords:</b> Speaker verification and authentication.</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Resumen</b></font></p>     <p align="justify"><font face="verdana" size="2">Este documento muestra los resultados de nuestro sistema de verificaci&oacute;n de hablante bajo dos escenarios: la Evaluaci&oacute;n <i>Face and Speaker Verification Evaluation </i>organizada por MOBIO (MObile BIOmetric consortium) y la Evaluaci&oacute;n de Reconocimiento de personas 2010 organizada por NIST. La parte central de nuestro esquema se basa en un modelado de Mezclas de Gaussianas (GMM) y m&aacute;xima verosimilitud. Primero, se extraen los par&aacute;metros importantes de la voz calculando los coeficientes cepstrales en escala mel, Mel Frequency Cepstral Coefficients (MFCC). Despu&eacute;s, dichos MFCCs entrenan las mezclas de Gaussianas dependientes del g&eacute;nero que posteriormente ser&aacute;n adaptadas y se obtendr&aacute;n los modelos de los usuarios objetivo. Para obtener estad&iacute;sticas confiables esos modelos objetivo son evaluados por un conjunto de se&ntilde;ales no conocidas y se obtienen puntuaciones finales. Por &uacute;ltimo, esas puntuaciones son etiquetadas como usuario objetivo o impostor. Hemos analizado diferentes configuraciones y encontramos que cada base de datos requiere una sintonizaci&oacute;n adecuada para mejorar su desempe&ntilde;o. Para la base de datos MOBIO, obtuvimos un porcentaje de error promedio de 16.43 %. Para la base de datos NIST2010, logramos un promedio de error de 16.61%. La base de datos NIST2010 considera varias condiciones. 
De esas condiciones, la condici&oacute;n de entrevista para entrenamiento y prueba mostr&oacute; el mejor error con 10.94 %, seguida por la condici&oacute;n de llamada telef&oacute;nica en entrenamiento y llamada telef&oacute;nica en prueba con 13.35%.</font></p>     <p align="justify"><font face="verdana" size="2"><b>Palabras clave:</b> Verificaci&oacute;n de hablante y autenticaci&oacute;n.</font></p>     ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><a href="/pdf/cys/v15n1/v15n1a3.pdf" target="_blank">DESCARGAR ART&Iacute;CULO EN FORMATO PDF</a></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>References</b></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>1. Bogert, B. P., Healy, M. J. R. &amp; Tukey, J. W. (1963). </b>The quefrency alanysis of time series for echoes: Cepstrum, pseudo&#150;autocovariance, cross&#150;cepstrum and saphe cracking. <i>Symposium on Time Series Analysis, </i>New York, USA, 209&#150;243.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053502&pid=S1405-5546201100030000300001&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>2. Burget, L., Fapso, M. Hubeika, V., Glembek, O., Karafi&aacute;t, M., Kockmann, M., Matejka, P., Schwarz, P., &amp; Cernocky, J. (2009).</b> But system for nist 2008 speaker recognition evaluation. <i>Interspeech 2009. </i>Brighton, Great Britain, 2335&#150;2338.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053504&pid=S1405-5546201100030000300002&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>3. Chen, S. S., &amp; Gopinath,  R. A. (2001). </b>Gaussianization. In Todd K. Leen, Thomas G. Dietterich,Volker Tresp (Eds.). 
<i>Advances in neural information processing systems 13, </i>(423&#150;429), Massachusetts, USA, The MIT Press.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053506&pid=S1405-5546201100030000300003&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>4. Davis, S. &amp; Mermelstein, P. (1980). </b>Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. <i>IEEE Transactions on Acoustics, Speech, and Signal Processing, </i>28(4), 357&#150;366.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053508&pid=S1405-5546201100030000300004&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>5. Dempster, A. P., Laird, N. M., &amp; Rubin, D. B. (1977). </b>Maximum likelihood from incomplete data via the EM algorithm. <i>Journal of the Royal Statistical Society, Series B, </i>39 (1), 1&#150;38.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053510&pid=S1405-5546201100030000300005&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>6. Duda, R. O. </b>&amp; <b>Hart, P. E. (1973). </b><i>Pattern classification and scene analysis. </i>New York: Wiley.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053512&pid=S1405-5546201100030000300006&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>7. Furui, S. (1981). </b>Cepstral analysis techniques for automatic speaker verification. <i>IEEE Transactions on Acoustics, Speech, and Signal Processing, </i>29 (2), 254&#150;272.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053514&pid=S1405-5546201100030000300007&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>8. Gauvain, J. L. &amp; Lee, C. H. (1994). </b>Maximum a posteriori estimation for multivariate Gaussian mixture observations of markov chains. <i>IEEE Transactions on Speech and Audio Processing, </i>2 (2), 291&#150;298.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053516&pid=S1405-5546201100030000300008&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>9. Hermansky, H., Morgan, N., Bayya, A., &amp; Kohn, P. (1992). </b>RASTA&#150;PLP speech analysis technique. <i>IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP&#150;92, </i>San Francisco, USA, 1, 121&#150;124.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053518&pid=S1405-5546201100030000300009&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>10. Lee, C.&#150;H. (1997). </b>A unified statistical hypothesis testing approach to speaker verification and verbal information verification. <i>Proceedings COST, Workshop on Speech Technology in the Public Telephone Network: Where are we today?, </i>Rhodes, Greece, 63&#150;72.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053520&pid=S1405-5546201100030000300010&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>11. Marcel, S., McCool, C., Matejka, P., Ahonen, T., Cernocky, J. (2010). </b><i>Mobile biometry (mobio) face and speaker verification evaluation. 
</i>Retrieved from <a href="http://publications.idiap.ch/index.php/publications/show/1848" target="_blank">http://publications.idiap.ch/index.php/publications/show/1848</a></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053522&pid=S1405-5546201100030000300011&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2"><b>12. Mari&eacute;thoz, J. &amp; Bengio, S. (2005). </b>A unified framework for score normalization techniques applied to text&#150;independent speaker verification. <i>IEEE Signal Processing Letters, </i>12 (7), 532&#150;535.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053523&pid=S1405-5546201100030000300012&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>13. Martin, A.F. &amp; Greenberg, C.S. (2009). </b>NIST 2008 Speaker Recognition Evaluation: Performance Across Telephone and Room Microphone Channels. <i>Interspeech 2009, </i>Brighton, United Kingdom, 2579&#150;2582.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053525&pid=S1405-5546201100030000300013&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>14. McCool, C. &amp; Marcel, S. (2010). </b><i>Mobio database for the ICPR 2010 face and speech competition. 
</i>Retrieved from <a href="http://publications.idiap.ch/index.php/publications/show/1757" target="_blank">http://publications.idiap.ch/index.php/publications/show/1757</a></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053527&pid=S1405-5546201100030000300014&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2"><b>15. Navratil, J. &amp; Ramaswamy, G.N. (2003). </b>The awe and mystery of t&#150;norm. <i>8<sup>th</sup> European Conference on Speech Communication and Technology, </i>Geneva, Switzerland, 2009&#150;2012.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053528&pid=S1405-5546201100030000300015&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>16. Pelecanos, J. &amp; Sridharan, S. (2001). </b>Feature warping for robust speaker verification. <i>A Speaker Odyssey&#150;The Speaker Recognition Workshop, </i>Crete, Greece, 213&#150;218.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053530&pid=S1405-5546201100030000300016&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>17. Petrovska&#150;Delacr&eacute;taz, D., Hannani, A. E., &amp; Chollet, G. (2007). </b>Text&#150;independent speaker verification: state of the art and challenges. <i>Progress in nonlinear speech processing. Lecture Notes in Computer Science, </i>4391, 135&#150;169.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053532&pid=S1405-5546201100030000300017&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>18. Reynolds, D.A. (1992). </b><i>A Gaussian mixture modeling approach to text&#150;independent speaker identification. </i>Ph.D. dissertation, Georgia Institute of Technology, Atlanta, Georgia, USA.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053534&pid=S1405-5546201100030000300018&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>19. Reynolds, D.A. (1995), </b>Speaker identification and verification using Gaussian mixture speaker models. <i>Speech Communication, </i>17 (1&#150;2), 91&#150;108.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053536&pid=S1405-5546201100030000300019&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>20. Reynolds, D.A., Quatieri, T. F. &amp; Dunn, R. B. (2000). </b>Speaker verification using adapted Gaussian mixture models. <i>Digital Signal Processing, </i>10 (1&#150;3), 19&#150;41.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053538&pid=S1405-5546201100030000300020&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>21. </b>Speaker Recognition Evaluation. Retrieved from <a href="http://www.itl.nist.gov/iad/mig/tests/sre/" target="_blank">http://www.itl.nist.gov/iad/mig/tests/sre/</a></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053540&pid=S1405-5546201100030000300021&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2"><b>22. </b><i>Speech processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Front&#150;end feature extraction algorithm; Compression algorithms. </i>ETSI ES 201 108 V1.1.2 (2000&#150;04), 2000.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053541&pid=S1405-5546201100030000300022&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>23. Viikki, O. &amp; Laurila, K. (1998). </b>Cepstral domain segmental feature vector normalization for noise robust speech recognition. 
<i>Speech Communication&#150; Special issue on robust speech recognition, </i>25 (1&#150;3), 133&#150;147.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053543&pid=S1405-5546201100030000300023&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>24. Wald, A. (1947). </b><i>Sequential analysis. </i>New York: John Wiley and Sons.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053545&pid=S1405-5546201100030000300024&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>      ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bogert]]></surname>
<given-names><![CDATA[B. P.]]></given-names>
</name>
<name>
<surname><![CDATA[Healy]]></surname>
<given-names><![CDATA[M. J. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Tukey]]></surname>
<given-names><![CDATA[J. W.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The quefrency alanysis of time series for echoes: Cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking]]></article-title>
<source><![CDATA[]]></source>
<year>1963</year>
<conf-name><![CDATA[ Symposium on Time Series Analysis]]></conf-name>
<conf-loc>New York </conf-loc>
<page-range>209-243</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Burget]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Fapso]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Hubeika]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Glembek]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Karafiát]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Kockmann]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Matejka]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Schwarz]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Cernocky]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[BUT system for NIST 2008 speaker recognition evaluation]]></article-title>
<source><![CDATA[]]></source>
<year>2009</year>
<conf-name><![CDATA[ Interspeech]]></conf-name>
<conf-date>2009</conf-date>
<conf-loc>Brighton </conf-loc>
<page-range>2335-2338</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[S. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Gopinath]]></surname>
<given-names><![CDATA[R. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Gaussianization]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Leen]]></surname>
<given-names><![CDATA[Todd K.]]></given-names>
</name>
<name>
<surname><![CDATA[Dietterich]]></surname>
<given-names><![CDATA[Thomas G.]]></given-names>
</name>
<name>
<surname><![CDATA[Tresp]]></surname>
<given-names><![CDATA[Volker]]></given-names>
</name>
</person-group>
<source><![CDATA[Advances in neural information processing systems 13]]></source>
<year>2001</year>
<page-range>423-429</page-range><publisher-loc><![CDATA[Massachusetts ]]></publisher-loc>
<publisher-name><![CDATA[The MIT Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Davis]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Mermelstein]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences]]></article-title>
<source><![CDATA[IEEE Transactions on Acoustics, Speech, and Signal Processing]]></source>
<year>1980</year>
<volume>28</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>357-366</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dempster]]></surname>
<given-names><![CDATA[A. P.]]></given-names>
</name>
<name>
<surname><![CDATA[Laird]]></surname>
<given-names><![CDATA[N. M.]]></given-names>
</name>
<name>
<surname><![CDATA[Rubin]]></surname>
<given-names><![CDATA[D. B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Maximum likelihood from incomplete data via the EM algorithm]]></article-title>
<source><![CDATA[Journal of the Royal Statistical Society, Series B]]></source>
<year>1977</year>
<volume>39</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>1-38</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Duda]]></surname>
<given-names><![CDATA[R. O.]]></given-names>
</name>
<name>
<surname><![CDATA[Hart]]></surname>
<given-names><![CDATA[P. E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Pattern classification and scene analysis]]></source>
<year>1973</year>
<publisher-loc><![CDATA[New York ]]></publisher-loc>
<publisher-name><![CDATA[Wiley]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Furui]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Cepstral analysis techniques for automatic speaker verification]]></article-title>
<source><![CDATA[IEEE Transactions on Acoustics, Speech, and Signal Processing]]></source>
<year>1981</year>
<volume>29</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>254-272</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gauvain]]></surname>
<given-names><![CDATA[J. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[C. H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains]]></article-title>
<source><![CDATA[IEEE Transactions on Speech and Audio Processing]]></source>
<year>1994</year>
<volume>2</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>291-298</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hermansky]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Morgan]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Bayya]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Kohn]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[RASTA-PLP speech analysis technique]]></article-title>
<source><![CDATA[]]></source>
<year>1992</year>
<volume>1</volume>
<conf-name><![CDATA[IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP-92]]></conf-name>
<conf-loc>San Francisco </conf-loc>
<page-range>121-124</page-range><publisher-loc><![CDATA[San Francisco ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[C.-H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A unified statistical hypothesis testing approach to speaker verification and verbal information verification]]></article-title>
<source><![CDATA[Proceedings COST Workshop on Speech Technology in the Public Telephone Network: Where are we today?]]></source>
<year>1997</year>
<page-range>63-72</page-range><publisher-loc><![CDATA[Rhodes ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Marcel]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[McCool]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Matejka]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Ahonen]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Cernocky]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Mobile biometry (MOBIO) face and speaker verification evaluation]]></source>
<year>2010</year>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mariéthoz]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Bengio]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A unified framework for score normalization techniques applied to text-independent speaker verification]]></article-title>
<source><![CDATA[IEEE Signal Processing Letters]]></source>
<year>2005</year>
<volume>12</volume>
<numero>7</numero>
<issue>7</issue>
<page-range>532-535</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martin]]></surname>
<given-names><![CDATA[A.F.]]></given-names>
</name>
<name>
<surname><![CDATA[Greenberg]]></surname>
<given-names><![CDATA[C.S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[NIST 2008 Speaker Recognition Evaluation: Performance Across Telephone and Room Microphone Channels]]></article-title>
<source><![CDATA[Interspeech]]></source>
<year>2009</year>
<page-range>2579-2582</page-range><publisher-loc><![CDATA[Brighton ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[McCool]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Marcel]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[MOBIO database for the ICPR 2010 face and speech competition]]></source>
<year>2010</year>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Navratil]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ramaswamy]]></surname>
<given-names><![CDATA[G.N.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The awe and mystery of t-norm]]></article-title>
<source><![CDATA[]]></source>
<year>2003</year>
<conf-name><![CDATA[8th European Conference on Speech Communication and Technology (Eurospeech)]]></conf-name>
<conf-loc>Geneva </conf-loc>
<page-range>2009-2012</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pelecanos]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Sridharan]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Feature warping for robust speaker verification]]></article-title>
<source><![CDATA[]]></source>
<year>2001</year>
<conf-name><![CDATA[A Speaker Odyssey - The Speaker Recognition Workshop]]></conf-name>
<conf-loc>Crete </conf-loc>
<page-range>213-218</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Petrovska-Delacrétaz]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Hannani]]></surname>
<given-names><![CDATA[A. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Chollet]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Text-independent speaker verification: state of the art and challenges]]></article-title>
<source><![CDATA[Progress in nonlinear speech processing. Lecture Notes in Computer Science]]></source>
<year>2007</year>
<volume>4391</volume>
<page-range>135-169</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
</person-group>
<source><![CDATA[A Gaussian mixture modeling approach to text-independent speaker identification]]></source>
<year>1992</year>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Speaker identification and verification using Gaussian mixture speaker models]]></article-title>
<source><![CDATA[Speech Communication]]></source>
<year>1995</year>
<volume>17</volume>
<numero>1-2</numero>
<issue>1-2</issue>
<page-range>91-108</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Quatieri]]></surname>
<given-names><![CDATA[T. F.]]></given-names>
</name>
<name>
<surname><![CDATA[Dunn]]></surname>
<given-names><![CDATA[R. B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Speaker verification using adapted Gaussian mixture models]]></article-title>
<source><![CDATA[Digital Signal Processing]]></source>
<year>2000</year>
<volume>10</volume>
<numero>1-3</numero>
<issue>1-3</issue>
<page-range>19-41</page-range></nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<source><![CDATA[Speaker Recognition Evaluation]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="">
<source><![CDATA[Speech processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Front-end feature extraction algorithm; Compression algorithms]]></source>
<year>2000</year>
</nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Viikki]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Laurila]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Cepstral domain segmental feature vector normalization for noise robust speech recognition]]></article-title>
<source><![CDATA[Speech Communication]]></source>
<year>1998</year>
<volume>25</volume>
<numero>1-3</numero>
<issue>1-3</issue>
<page-range>133-147</page-range></nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wald]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Sequential analysis]]></source>
<year>1947</year>
<publisher-loc><![CDATA[New York ]]></publisher-loc>
<publisher-name><![CDATA[John Wiley and Sons]]></publisher-name>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
