<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462011000300004</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Speaker Verification on Summed-Channel Conditions with Confidence Measures]]></article-title>
<article-title xml:lang="es"><![CDATA[Verificación de locutor en condiciones de canal sumado con medidas de confianza]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Vaquero Avilés Casco]]></surname>
<given-names><![CDATA[Carlos]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Villalba López]]></surname>
<given-names><![CDATA[Jesús]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Ortega Giménez]]></surname>
<given-names><![CDATA[Alfonso]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Lleida Solano]]></surname>
<given-names><![CDATA[Eduardo]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[University of Zaragoza, Aragón Institute for Engineering Research, Communications Technology Group]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Spain</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>09</month>
<year>2011</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>09</month>
<year>2011</year>
</pub-date>
<volume>15</volume>
<numero>1</numero>
<fpage>27</fpage>
<lpage>37</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462011000300004&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462011000300004&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462011000300004&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[This paper addresses the problem of speaker verification in two-speaker conversations, proposing a set of confidence measures to assess the quality of a given speaker segmentation. We study how these measures can be used to estimate the performance of a state-of-the-art speaker verification system, the I3A submission for the core-summed condition in the NIST SRE 2010. We present a Factor Analysis based speaker segmentation system, along with three confidence measures that are fused to obtain a single measure that we show to constitute a good estimate of the segmentation accuracy when evaluated on the summed-channel telephone data of the NIST SRE 2008. Finally, we present speaker verification results obtained with the I3A submission for the NIST SRE 2010 on several summed-channel conditions of this evaluation. We show that the confidence measure also predicts the performance of a state-of-the-art speaker verification system when it faces two-speaker conversations.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[Este artículo trata el problema de verificación de locutor en conversaciones con dos locutores, proponiendo un conjunto de medidas de confianza para evaluar la calidad de una segmentación de locutores dada. Estudiamos cómo estas medidas pueden ser utilizadas para estimar el rendimiento de un sistema de verificación del locutor del estado del arte, el sistema del I3A para la evaluación de reconocimiento del locutor NIST SRE 2010. Presentamos un sistema de segmentación de locutor basado en Análisis Factorial y tres medidas de confianza que son combinadas en una medida que constituye una buena estimación de la calidad de la segmentación, cuando se evalúa en las grabaciones de canal sumado de la NIST SRE 2008. Finalmente presentamos resultados de verificación de locutor obtenidos con el sistema del I3A en distintas condiciones de canal sumado de la NIST SRE 2010. Se demuestra que las medidas de confianza también predicen el rendimiento de un sistema de verificación del locutor cuando se enfrenta a conversaciones de dos locutores.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Confidence measures]]></kwd>
<kwd lng="en"><![CDATA[speaker segmentation]]></kwd>
<kwd lng="en"><![CDATA[speaker verification]]></kwd>
<kwd lng="en"><![CDATA[telephone conversations]]></kwd>
<kwd lng="es"><![CDATA[Medidas de confianza]]></kwd>
<kwd lng="es"><![CDATA[segmentación de locutor]]></kwd>
<kwd lng="es"><![CDATA[verificación de locutor]]></kwd>
<kwd lng="es"><![CDATA[conversaciones telefónicas]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[ <p align="justify"><font face="verdana" size="4">Art&iacute;culos</font></p>     <p align="justify"><font face="verdana" size="4">&nbsp;</font></p>     <p align="center"><font face="verdana" size="4"><b>Speaker Verification on Summed&#150;Channel Conditions with Confidence Measures</b></font></p>     <p align="center"><font face="verdana" size="2">&nbsp;</font></p>     <p align="center"><font face="verdana" size="3"><b>Verificaci&oacute;n de locutor en condiciones de canal sumado con medidas de confianza</b></font></p>     <p align="center"><font face="verdana" size="2">&nbsp;</font></p>     <p align="center"><font face="verdana" size="2"><b>Carlos Vaquero Avil&eacute;s Casco, Jes&uacute;s Villalba L&oacute;pez, Alfonso Ortega Gim&eacute;nez, and Eduardo Lleida Solano</b></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><i>Communications Technology Group (GTC), Arag&oacute;n Institute for Engineering Research (I3A), University of Zaragoza, Spain. E&#150;mail: </i><a href="mailto:cvaquero@unizar.es">cvaquero@unizar.es</a>, <a href="mailto:villalba@unizar.es">villalba@unizar.es</a>, <a href="mailto:ortega@unizar.es">ortega@unizar.es</a>, <a href="mailto:lleida@unizar.es">lleida@unizar.es</a></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2">Article received on July 30, 2010.     <br> Accepted on January 15, 2011.</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Abstract</b></font></p>     <p align="justify"><font face="verdana" size="2">This paper addresses the problem of speaker verification in two&#150;speaker conversations, proposing a set of confidence measures to assess the quality of a given speaker segmentation. We study how these measures can be used to estimate the performance of a state&#150;of&#150;the&#150;art speaker verification system, the I3A submission for the core&#150;summed condition in the NIST SRE 2010. We present a Factor Analysis based speaker segmentation system, along with three confidence measures that are fused to obtain a single measure that we show to constitute a good estimate of the segmentation accuracy when evaluated on the summed&#150;channel telephone data of the NIST SRE 2008. Finally, we present speaker verification results obtained with the I3A submission for the NIST SRE 2010 on several summed&#150;channel conditions of this evaluation. We show that the confidence measure also predicts the performance of a state&#150;of&#150;the&#150;art speaker verification system when it faces two&#150;speaker conversations. 
</font></p>     <p align="justify"><font face="verdana" size="2"><b>Keywords:</b> Confidence measures, speaker segmentation, speaker verification and telephone conversations.</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Resumen</b></font></p>     <p align="justify"><font face="verdana" size="2">Este art&iacute;culo trata el problema de verificaci&oacute;n de locutor en conversaciones con dos locutores, proponiendo un conjunto de medidas de confianza para evaluar la calidad de una segmentaci&oacute;n de locutores dada. Estudiamos c&oacute;mo estas medidas pueden ser utilizadas para estimar el rendimiento de un sistema de verificaci&oacute;n del locutor del estado del arte, el sistema del I3A para la evaluaci&oacute;n de reconocimiento del locutor NIST SRE 2010. Presentamos un sistema de segmentaci&oacute;n de locutor basado en An&aacute;lisis Factorial y tres medidas de confianza que son combinadas en una medida que constituye una buena estimaci&oacute;n de la calidad de la segmentaci&oacute;n, cuando se eval&uacute;a en las grabaciones de canal sumado de la NIST SRE 2008. Finalmente presentamos resultados de verificaci&oacute;n de locutor obtenidos con el sistema del I3A en distintas condiciones de canal sumado de la NIST SRE 2010. Se demuestra que las medidas de confianza tambi&eacute;n predicen el rendimiento de un sistema de verificaci&oacute;n del locutor cuando se enfrenta a conversaciones de dos locutores.</font></p>     <p align="justify"><font face="verdana" size="2"><b>Palabras clave:</b> Medidas de confianza, segmentaci&oacute;n de locutor, verificaci&oacute;n de locutor y conversaciones telef&oacute;nicas.</font></p>     ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><a href="/pdf/cys/v15n1/v15n1a4.pdf" target="_blank">DESCARGAR ART&Iacute;CULO EN FORMATO PDF</a></font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>Acknowledgements</b></font></p>     <p align="justify"><font face="verdana" size="2">This work was supported by project TIN2008&#150;06856&#150;C05&#150;04 and the FPU program of the MEC of the Spanish government.</font></p>     <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>     <p align="justify"><font face="verdana" size="2"><b>References</b></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>1. Bogert, B. P., Healy, M. J. R. &amp; Tukey, J. W. (1963). </b>The quefrency alanysis of time series for echoes: Cepstrum, pseudo&#150;autocovariance, cross&#150;cepstrum and saphe cracking. <i>Symposium on Time Series Analysis, </i>New York, USA, 209&#150;243.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053602&pid=S1405-5546201100030000400001&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>2. Burget, L., Fapso, M., Hubeika, V., Glembek, O., Karafi&aacute;t, M., Kockmann, M., Matejka, P., Schwarz, P., &amp; Cernocky, J. (2009). </b>BUT system for NIST 2008 speaker recognition evaluation. <i>Interspeech 2009. </i>Brighton, Great Britain, 2335&#150;2338.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053604&pid=S1405-5546201100030000400002&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>3. Chen, S. S., &amp; Gopinath, R. A. (2001). </b>Gaussianization. In Todd K. Leen, Thomas G. Dietterich,Volker Tresp (Eds.). <i>Advances in neural information processing systems 13, </i>(423&#150;429), Massachusetts, USA, The MIT Press.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053606&pid=S1405-5546201100030000400003&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>4. Davis, S. &amp; Mermelstein, P. (1980). </b>Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. <i>IEEE Transactions on Acoustics, Speech, and Signal Processing, </i>28(4), 357&#150;366.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053608&pid=S1405-5546201100030000400004&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>5. Dempster, A. P., Laird, N. M., &amp; Rubin, D. B. (1977). </b>Maximum likelihood from incomplete data via the EM algorithm. <i>Journal ofthe Royal Statistical Society, Series B, </i>39 (1), 1&#150;38.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053610&pid=S1405-5546201100030000400005&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>6. Duda, R. O. </b>&amp; <b>Hart, P. E. (1973). </b><i>Pattern classification and scene analysis. </i>New York: Wiley.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053612&pid=S1405-5546201100030000400006&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>7. Furui, S. (1981). </b>Cepstral analysis techniques for automatic speaker verification. <i>IEEE Transactions on Acoustics, Speech, and Signal Processing, </i>29 (2), 254&#150;272.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053614&pid=S1405-5546201100030000400007&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>8. Gauvain, J. L. &amp; Lee, C. H. (1994). </b>Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. <i>IEEE Transactions on Speech and Audio Processing, </i>2 (2), 291&#150;298.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053616&pid=S1405-5546201100030000400008&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>9. Hermansky, H., Morgan, N., Bayya, A., &amp; Kohn, P. (1992). </b>RASTA&#150;PLP speech analysis technique. <i>IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP&#150;92, </i>San Francisco, USA, 1, 121&#150;124.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053618&pid=S1405-5546201100030000400009&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>10. Lee, C.&#150;H. (1997). </b>A unified statistical hypothesis testing approach to speaker verification and verbal information&nbsp;verification. <i>Proceedings COST,Workshop on Speech Technology in the Public Telephone Network: Where are we today?, </i>Rhodes, Greece, 63&#150;72.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053620&pid=S1405-5546201100030000400010&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>11. Marcel, S., McCool, C., Matejka, P., Ahonen, T., Cernocky, J. (2010). </b><i>Mobile biometry (mobio) face and speaker verification evaluation. 
</i>Retrieved from <a href="http://publications.idiap.ch/index.php/publications/show/1848" target="_blank">http://publications.idiap.ch/index.php/publications/show/1848</a></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053622&pid=S1405-5546201100030000400011&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2"><b>12. Mari&eacute;thoz, J. &amp; Bengio, S. (2005). </b>A unified framework for score normalization techniques applied to text&#150;independent speaker verification. <i>IEEE Signal Processing Letters, </i>12 (7), 532&#150;535.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053623&pid=S1405-5546201100030000400012&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>13. Martin, A.F. &amp; Greenberg, C.S. (2009). </b>NIST 2008 Speaker Recognition Evaluation: Performance across Telephone and Room Microphone Channels. <i>Interspeech 2009, </i>Brighton, United Kingdom, 2579&#150;2582.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053625&pid=S1405-5546201100030000400013&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>14. McCool, C. &amp; Marcel, S. (2010). </b><i>Mobio database for the ICPR 2010 face and speech competition. </i>Retrieved from <a href="http://publications.idiap.ch/index.php/publications/show/1757" target="_blank">http://publications.idiap.ch/index.php/publications/show/1757</a></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053627&pid=S1405-5546201100030000400014&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2"><b>15. Navratil, J. &amp; Ramaswamy, G.N. (2003). </b>The awe and mystery of t&#150;norm. <i>8<sup>th</sup> European Conference on Speech Communication and Technology, </i>Geneva, Switzerland, 2009&#150;2012.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053628&pid=S1405-5546201100030000400015&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>16. Pelecanos, J. &amp; Sridharan, S. (2001). 
</b>Feature warping for robust speaker verification. <i>A Speaker Odyssey&#150;The Speaker Recognition Workshop, </i>Crete, Greece, 213&#150;218.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053630&pid=S1405-5546201100030000400016&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>17. Petrovska&#150;Delacr&eacute;taz, D., Hannani, A. E., &amp; Chollet, G. (2007). </b>Text&#150;independent speaker verification: state of the art and challenges. <i>Progress in nonlinear speech processing. Lecture Notes in Computer Science, </i>4391, 135&#150;169.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053632&pid=S1405-5546201100030000400017&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>18. Reynolds, D.A. (1992). </b><i>A Gaussian mixture modeling approach to text&#150;independent speaker identification. </i>Ph.D. dissertation, Georgia Institute of Technology, Atlanta, Georgia, USA.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053634&pid=S1405-5546201100030000400018&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>19. Reynolds, D.A. (1995), </b>Speaker identification and verification using Gaussian mixture speaker models. <i>Speech Communication, </i>17 (1&#150;2), 91&#150;108.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053636&pid=S1405-5546201100030000400019&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>20. Reynolds, D.A., Quatieri, T. F. &amp; Dunn, R. B. (2000). </b>Speaker verification using adapted Gaussian mixture models. <i>Digital Signal Processing, </i>10 (1&#150;3), 19&#150;41.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053638&pid=S1405-5546201100030000400020&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>21. </b>Speaker Recognition Evaluation. Retrieved from <a href="http://www.itl.nist.gov/iad/mig/tests/sre/" target="_blank">http://www.itl.nist.gov/iad/mig/tests/sre/</a></font>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053640&pid=S1405-5546201100030000400021&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --><!-- ref --><p align="justify"><font face="verdana" size="2"><b>22. </b><i>Speech processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Front&#150;end feature extraction algorithm; Compression algorithms. </i>ETSI ES 201 108 V1.1.2 (2000&#150;04), 2000.    
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053641&pid=S1405-5546201100030000400022&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     <!-- ref --><p align="justify"><font face="verdana" size="2"><b>23. Viikki, O. &amp; Laurila, K. (1998). </b>Cepstral domain segmental feature vector normalization for noise robust speech recognition. <i>Speech Communication&#150;Special issue on robust speech recognition, </i>25 (1&#150;3), 133&#150;147.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053643&pid=S1405-5546201100030000400023&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>     ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>24. Wald, A. (1947). </b><i>Sequential analysis. </i>New York: John Wiley and Sons.    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[&#160;<a href="javascript:void(0);" onclick="javascript: window.open('/scielo.php?script=sci_nlinks&ref=2053645&pid=S1405-5546201100030000400024&lng=','','width=640,height=500,resizable=yes,scrollbars=1,menubar=yes,');">Links</a>&#160;]<!-- end-ref --></font></p>      ]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bogert]]></surname>
<given-names><![CDATA[B. P.]]></given-names>
</name>
<name>
<surname><![CDATA[Healy]]></surname>
<given-names><![CDATA[M. J. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Tukey]]></surname>
<given-names><![CDATA[J. W.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The quefrency alanysis of time series for echoes: Cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking]]></article-title>
<source><![CDATA[]]></source>
<year>1963</year>
<conf-name><![CDATA[ Symposium on Time Series Analysis]]></conf-name>
<conf-loc>New York </conf-loc>
<page-range>209-243</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Burget]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Fapso]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Hubeika]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Glembek]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Karafiát]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Kockmann]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Matejka]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Schwarz]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Cernocky]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[BUT system for NIST 2008 speaker recognition evaluation]]></article-title>
<source><![CDATA[]]></source>
<year>2009</year>
<conf-name><![CDATA[ Interspeech]]></conf-name>
<conf-date>2009</conf-date>
<conf-loc>Brighton </conf-loc>
<page-range>2335-2338</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chen]]></surname>
<given-names><![CDATA[S. S.]]></given-names>
</name>
<name>
<surname><![CDATA[Gopinath]]></surname>
<given-names><![CDATA[R. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Gaussianization]]></article-title>
<person-group person-group-type="editor">
<name>
<surname><![CDATA[Leen]]></surname>
<given-names><![CDATA[Todd K.]]></given-names>
</name>
<name>
<surname><![CDATA[Dietterich]]></surname>
<given-names><![CDATA[Thomas G.]]></given-names>
</name>
<name>
<surname><![CDATA[Tresp]]></surname>
<given-names><![CDATA[Volker]]></given-names>
</name>
</person-group>
<source><![CDATA[Advances in neural information processing systems 13]]></source>
<year>2001</year>
<page-range>423-429</page-range><publisher-loc><![CDATA[Massachusetts ]]></publisher-loc>
<publisher-name><![CDATA[The MIT Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Davis]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Mermelstein]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences]]></article-title>
<source><![CDATA[IEEE Transactions on Acoustics, Speech, and Signal Processing]]></source>
<year>1980</year>
<volume>28</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>357-366</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dempster]]></surname>
<given-names><![CDATA[A. P.]]></given-names>
</name>
<name>
<surname><![CDATA[Laird]]></surname>
<given-names><![CDATA[N. M.]]></given-names>
</name>
<name>
<surname><![CDATA[Rubin]]></surname>
<given-names><![CDATA[D. B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Maximum likelihood from incomplete data via the EM algorithm]]></article-title>
<source><![CDATA[Journal of the Royal Statistical Society, Series B]]></source>
<year>1977</year>
<volume>39</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>1-38</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Duda]]></surname>
<given-names><![CDATA[R. O.]]></given-names>
</name>
<name>
<surname><![CDATA[Hart]]></surname>
<given-names><![CDATA[P. E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Pattern classification and scene analysis]]></source>
<year>1973</year>
<publisher-loc><![CDATA[New York ]]></publisher-loc>
<publisher-name><![CDATA[Wiley]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Furui]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Cepstral analysis techniques for automatic speaker verification]]></article-title>
<source><![CDATA[IEEE Transactions on Acoustics, Speech, and Signal Processing]]></source>
<year>1981</year>
<volume>29</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>254-272</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gauvain]]></surname>
<given-names><![CDATA[J. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[C. H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains]]></article-title>
<source><![CDATA[IEEE Transactions on Speech and Audio Processing]]></source>
<year>1994</year>
<volume>2</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>291-298</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hermansky]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Morgan]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Bayya]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Kohn]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[RASTA-PLP speech analysis technique]]></article-title>
<source><![CDATA[]]></source>
<year>1992</year>
<volume>1</volume>
<conf-name><![CDATA[IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP-92]]></conf-name>
<conf-loc>San Francisco </conf-loc>
<page-range>121-124</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[C.-H.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A unified statistical hypothesis testing approach to speaker verification and verbal information verification]]></article-title>
<source><![CDATA[]]></source>
<year>1997</year>
<conf-name><![CDATA[Proceedings COST Workshop on Speech Technology in the Public Telephone Network: Where are we today?]]></conf-name>
<conf-loc>Rhodes </conf-loc>
<page-range>63-72</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Marcel]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[McCool]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Matejka]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Ahonen]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Cernocky]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Mobile biometry (mobio) face and speaker verification evaluation]]></source>
<year>2010</year>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mariéthoz]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Bengio]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A unified framework for score normalization techniques applied to text-independent speaker verification]]></article-title>
<source><![CDATA[IEEE Signal Processing Letters]]></source>
<year>2005</year>
<volume>12</volume>
<numero>7</numero>
<issue>7</issue>
<page-range>532-535</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martin]]></surname>
<given-names><![CDATA[A.F.]]></given-names>
</name>
<name>
<surname><![CDATA[Greenberg]]></surname>
<given-names><![CDATA[C.S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[NIST 2008 Speaker Recognition Evaluation: Performance across Telephone and Room Microphone Channels]]></article-title>
<source><![CDATA[Interspeech]]></source>
<year>2009</year>
<page-range>2579-2582</page-range><publisher-loc><![CDATA[Brighton ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[McCool]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Marcel]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Mobio database for the ICPR 2010 face and speech competition]]></source>
<year>2010</year>
</nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Navratil]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ramaswamy]]></surname>
<given-names><![CDATA[G.N.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The awe and mystery of t-norm]]></article-title>
<source><![CDATA[]]></source>
<year>2003</year>
<conf-name><![CDATA[8th European Conference on Speech Communication and Technology]]></conf-name>
<conf-loc>Geneva </conf-loc>
</nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pelecanos]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Sridharan]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Feature warping for robust speaker verification]]></article-title>
<source><![CDATA[]]></source>
<year>2001</year>
<conf-name><![CDATA[A Speaker Odyssey - The Speaker Recognition Workshop]]></conf-name>
<conf-loc>Crete </conf-loc>
<page-range>213-218</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Petrovska-Delacrétaz]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Hannani]]></surname>
<given-names><![CDATA[A. E.]]></given-names>
</name>
<name>
<surname><![CDATA[Chollet]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Text-independent speaker verification: state of the art and challenges]]></article-title>
<source><![CDATA[Progress in nonlinear speech processing. Lecture Notes in Computer Science]]></source>
<year>2007</year>
<volume>4391</volume>
<page-range>135-169</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
</person-group>
<source><![CDATA[A Gaussian mixture modeling approach to text-independent speaker identification]]></source>
<year>1992</year>
</nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Speaker identification and verification using Gaussian mixture speaker models]]></article-title>
<source><![CDATA[Speech Communication]]></source>
<year>1995</year>
<volume>17</volume>
<numero>1</numero><numero>2</numero>
<issue>1</issue><issue>2</issue>
<page-range>91-108</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Reynolds]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Quatieri]]></surname>
<given-names><![CDATA[T. F.]]></given-names>
</name>
<name>
<surname><![CDATA[Dunn]]></surname>
<given-names><![CDATA[R. B.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Speaker verification using adapted Gaussian mixture models]]></article-title>
<source><![CDATA[Digital Signal Processing]]></source>
<year>2000</year>
<volume>10</volume>
<numero>1</numero><numero>3</numero>
<issue>1</issue><issue>3</issue>
<page-range>19-41</page-range></nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="">
<source><![CDATA[Speaker Recognition Evaluation]]></source>
<year></year>
</nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="">
<source><![CDATA[Speech processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Front-end feature extraction algorithm; Compression algorithms]]></source>
<year>2000</year>
</nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Viikki]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Laurila]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Cepstral domain segmental feature vector normalization for noise robust speech recognition]]></article-title>
<source><![CDATA[Speech Communication]]></source>
<year>1998</year>
<volume>25</volume>
<numero>1</numero><numero>3</numero>
<issue>1</issue><issue>3</issue>
<page-range>133-147</page-range></nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wald]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Sequential analysis]]></source>
<year>1947</year>
<publisher-loc><![CDATA[New York ]]></publisher-loc>
<publisher-name><![CDATA[John Wiley and Sons]]></publisher-name>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
