SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

Print version ISSN 1405-5546

Comp. y Sist. vol.15 no.1 México Jul./Sep. 2011

 

Articles

 

A New Phono–Articulatory Feature Representation for Language Identification in a Discriminative Framework

 

Nueva representación de características fono–articulatorias para identificación del idioma en un marco discriminativo

 

Oneisys Núñez Cuadra and José Ramón Calvo de Lara

 

Centro de Aplicaciones de Tecnologías de Avanzada, Cuba. E–mail: oneysita@yahoo.com, jcalvo@cenatav.co.cu

 

Article received on March 18, 2011.
Accepted on June 30, 2011.

 

Abstract

State-of-the-art language identification methods are based on acoustic or phonetic features. Recently, phono–articulatory features have been included as a new speech characteristic that conveys language information. The authors propose a new phono–articulatory representation of speech in a discriminative framework to identify languages. This simple representation shows good results in discriminating between English and Spanish, using a reduced training set of phono–articulatory trigram vectors.

Keywords: Phonetic features, articulatory features, language recognition, support vector machines.
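
As a rough illustration of what a "trigram vector" representation could look like, the sketch below turns a sequence of symbolic labels into a vector of relative trigram frequencies. It is not the authors' implementation: the label names (`stop`, `vowel`, `nasal`, `fricative`), the function names, and the choice of relative-frequency normalization are all assumptions made for this example, since this page does not describe the actual feature inventory or vector construction.

```python
from collections import Counter

def trigrams(labels):
    """Return the consecutive label triples of a sequence."""
    return list(zip(labels, labels[1:], labels[2:]))

def build_vocabulary(sequences):
    """Index every trigram seen in the given sequences, in sorted order."""
    vocab = sorted({t for seq in sequences for t in trigrams(seq)})
    return {t: i for i, t in enumerate(vocab)}

def trigram_vector(labels, vocab):
    """Map one sequence to relative frequencies of in-vocabulary trigrams."""
    counts = Counter(t for t in trigrams(labels) if t in vocab)
    total = sum(counts.values())
    vec = [0.0] * len(vocab)
    for t, c in counts.items():
        vec[vocab[t]] = c / total
    return vec

# Hypothetical label sequences (assumed for illustration only).
train = [
    ["stop", "vowel", "nasal", "vowel"],
    ["fricative", "vowel", "stop", "vowel", "nasal"],
]
vocab = build_vocabulary(train)
vectors = [trigram_vector(seq, vocab) for seq in train]
```

Fixed-length vectors of this kind, one per utterance and labeled by language, are the sort of input a discriminative classifier such as a support vector machine takes for training.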

 

Summary

State-of-the-art language identification systems rely on acoustic or phonetic features. Recently, phono–articulatory features have been added as a new characterization of speech that carries information about the language. The authors propose a new phono–articulatory representation of speech, used within a discriminative framework, to identify languages. This simple representation gives good results in discriminating between English and Spanish, using a small training set based on phono–articulatory trigram vectors.

Keywords: Phonetic features, articulatory traits, language recognition, support vector machines.

 


 
