
Computación y Sistemas

Print version ISSN 1405-5546

Comp. y Sist. vol.15 no.1 México Jul./Sep. 2011

 

Articles

 

Development of Voice-Based Tools for Accessibility to Computer Services

 


 

Oscar Saz Torralba, William Ricardo Rodríguez Dueñas, and Eduardo Lleida Solano

 

Communications Technology Group (GTC), Aragón Institute for Engineering Research (I3A), University of Zaragoza, Zaragoza, Spain. E-mail: oskarsaz@unizar.es, wricardo@unizar.es, lleida@unizar.es

 

Article received on July 30, 2010.
Accepted on January 15, 2011.

 

Abstract

This work presents the development of two tools that use speech technologies to give people with different disabilities access to computer applications. The core idea is to use the voice emissions of a severely disabled user to replace mouse and keyboard clicks in one tool, and to control the movement of the cursor in the other. The speech technologies required for these tasks are robust energy estimation and robust formant calculation with normalization. The paper also gives a comprehensive view of the whole process required for the successful development of these tools: establishing contact with assistive and educational institutions, understanding the difficulties these groups face in everyday life, defining how technology can help in these cases, the actual development of the tools and, finally, their deployment with real users to assess usability and functionality.

Keywords: Voice I/O, user-centered design, signal analysis, synthesis and processing, speech processing, universal accessibility and handicapped aids.
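The abstract mentions robust energy estimation as the basis for replacing mouse and keyboard clicks with voice emissions. The paper itself does not give an algorithm here, but the general technique can be sketched as short-time energy detection with a hold-time requirement to reject noise spikes; the frame sizes, threshold, and `min_frames` values below are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def short_time_energy(signal, frame_len=512, hop=256):
    """Short-time log-energy (dB) per frame of a mono signal."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([10 * np.log10(np.sum(f.astype(float) ** 2) + 1e-12)
                     for f in frames])

def detect_voice_clicks(energy_db, threshold_db, min_frames=3):
    """Return frame indices where a voiced burst starts.

    A 'click' fires only when energy stays above the threshold for at
    least min_frames consecutive frames, a simple robustness measure
    against short noise spikes.
    """
    clicks, run = [], 0
    for i, e in enumerate(energy_db):
        if e > threshold_db:
            run += 1
            if run == min_frames:
                clicks.append(i - min_frames + 1)
        else:
            run = 0
    return clicks
```

In a real assistive tool the threshold would be calibrated per user (severely disabled speakers may produce weak or irregular emissions), and each detected burst would be forwarded to the operating system as a mouse click or key press.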

 

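The second tool drives the cursor from robust formant calculation with normalization. As a hedged sketch of how normalized formants can steer a cursor (in the spirit of vowel-driven control such as the Vocal Joystick cited by the authors), the mapping below linearly scales F1/F2 into a unit square and treats the neutral vowel as a joystick center. The formant ranges and gain are illustrative defaults, not the paper's values; a fixed range stands in for the speaker-dependent normalization (e.g. vocal-tract-length warping) the paper refers to.

```python
def normalize_formants(f1, f2,
                       f1_range=(250.0, 900.0),
                       f2_range=(600.0, 2500.0)):
    """Linearly map raw formants (Hz) into [0, 1], clamping outliers.

    In practice the ranges would be adapted per speaker to make the
    control robust across users; fixed ranges are used here only for
    illustration.
    """
    def scale(v, lo, hi):
        return min(1.0, max(0.0, (v - lo) / (hi - lo)))
    return scale(f1, *f1_range), scale(f2, *f2_range)

def cursor_delta(f1, f2, gain=20.0):
    """Convert a vowel's formants to a (dx, dy) cursor step in pixels.

    The normalized point (0.5, 0.5) is treated as the neutral vowel and
    produces no movement; deviations push the cursor proportionally,
    like a joystick.
    """
    n1, n2 = normalize_formants(f1, f2)
    dx = gain * (n2 - 0.5)   # front/back vowel dimension -> horizontal
    dy = gain * (n1 - 0.5)   # open/close vowel dimension -> vertical
    return dx, dy
```

Sustaining an extreme vowel thus moves the cursor steadily in one direction, while returning to a neutral vocalization stops it.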

 


 

Acknowledgements

The authors wish to thank the following people: Antonio Escartín for his work; Marta Peña and the staff of CADIS-Huesca; and Verónica Bermúdez, Laura Abarca, Sara Mejuto and the employees and users of ASPACE-Huesca. This work was supported by project TIN2008-06856-C05-04 from the MEC of the Spanish government, the Santander Bank Scholarship, and a collaboration project between CADIS-Huesca and the University of Zaragoza funded by Caja de Ahorros de la Inmaculada (CAI), Instituto Aragonés de Servicios Sociales (IASS) and Diputación Provincial de Huesca (DPH).

 

Referencias

1. Bilmes, J.–A., Malkin, J., Li, X., Harada, S., Kilanski, K., Kirchhoff, K., Wright, R., Subramanya, A., Landay, J.–A., Dowden, P. & Chizeck, H. (2006). The Vocal Joystick. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Toulouse, France, 1, 625–628.         [ Links ]

2. Necioglu, B.F., Clements, M.A., & Barnwell, T.P. (2000). Unsupervised Estimation of the Human Vocal Tract Length over Sentence Level Utterances. 2000 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2000), Istanbul, Turkey, 3, 1319–1322.         [ Links ]

3. Creer, S. M., Green, P. D., Cunningham, S. P. & Fatema, K. (2009). Personalizing Synthetic Voices for People with Progressive Speech Disorders: Judging Voice Similarity. 10th Annual Conference of the International Speech Communication Association (INTERSPEECH 2009), Brighton, United Kingdom, 1427–1430.         [ Links ]

4. Cucchiarini, C., Lembrechts, D. & Strik, H. (2008). HLT and Communicative Disabilities: The Need for Co–Operation between Government, Industry and Academia. Language and Speech Technology Conference (LangTech 2008), Rome, Italy, 125–128.         [ Links ]

5. Ephraim, Y. & Malah, D. (1984). Speech Enhancement Using a Minimum Mean–Square Error Short–Time Spectral Amplitude Estimator. IEEE Transactions on Acoustic, Speech Signal Processing, 32(6), 1109–1121.         [ Links ]

6. Ephraim, Y. & Malah, D. (1985). Speech Enhancement Using A Minimum Mean–Square Error Log–Spectral Amplitude Estimator. IEEE Transactions on Acoustic, Speech Signal Processing, 33(2), 443–445.         [ Links ]

7. Gouvea, E.–B., & Stern, R.–M. (1997). Speaker Normalization Through Formant–Based Warping Of The Frequency Scale. 5th European Conference on Speech Communication and Technology (EUROSPEECH 1997), Rhodes, Greece, 1139–1142.         [ Links ]

8. Harada, S., Landay, J.–A., Malkin, J., Li, X. & Bilmes, J.A. (2008). The Vocal Joystick: Evaluation of Voice–Based Cursor Control Techniques for Assistive Technology. Disability and Rehabilitation: Assistive Technology, 3(1), 22–34.         [ Links ]

9. Hawley, M., Enderby, P., Green, P., Brownsell, S., Hatzis, A., Parker, M., Carmichael, J., Cunningham, S., O'Neill, P. & Palmer, R. (2003). STARDUST: Speech Training And Recognition for Dysarthic Users of Assistive Technology. 7th European Conference for the Advancement of Assistive Technology (AAATE 2003), Dublin, Ireland, 959–963.         [ Links ]

10. Iturrate, I., Antelis, J.–M., Kübler, A. & Mínguez, J. (2009). A Non–Invasive Brain–Actuated Wheelchair based on a P300 Neurophysiological Protocol and Automated Navigation. IEEE Transactions on Robotics, 25(3), 614–627.         [ Links ]

11. Lee, L. & Rose, R. (1998). A Frequency Warping Approach to Speaker Normalization. IEEE Transactions on Speech and Audio Processing, 6(1), 49–60.         [ Links ]

12. Palomo, P., González, T., Rivas, R., Irazabal I. & Ruiz, A. (2009). IRISCOM. Proyecto IRIS. IV Jornadas Iberoamericanas de Tecnologías de Apoyo a Discapacidad, Las tecnologías de apoyo en parálisis cerebral, Madrid, Spain, 87–91.         [ Links ]

13. Rodríguez, W.–R. & Lleida, E. (2009). Formant Estimation in Children's Speech and its application for a Spanish Speech Therapy Tool. 2009 Workshop on Speech and Language Technologies in Education (SLaTE 2009), Wroxall Abbey Estates, Warwickshire, United Kingdom, 81–84.         [ Links ]

14. Rodríguez, W.–R., Saz, O., Miguel, A. & Lleida, E. (2010). On Line Vocal Tract Length Estimation for Speaker Normalization in Speech Recognition. VI Jornadas en Tecnología del Habla and II Iberian SLTech Workshop (Fala 2010), Vigo, Spain, 119–122.         [ Links ]

15. Sánchez Ortega, P.–L., Cámara Nebreda, J.–M. & Núñez Angulo, B. (2009). Interacción Con Ordenador Mediantes Dispositivos Inalámbricos Para Usuarios Con Movilidad Muy Reducida. Colaboración Universidad de Burgos – APACE. IV Jornadas Iberoamericanas de Tecnologías de Apoyo a Discapacidad, Las tecnologías de apoyo en parálisis cerebral, Madrid, Spain, 45–49.         [ Links ]

16. Sanders, E., Ruiter, M., Beijer, L. & Strik, H. (2002). Automatic Recognition of Dutch Dysarthric Speech: A Pilot Study. 7th International Conference on Spoken Language Processing (INTERSPEECH 2002), Denver CO, USA, 661–664.         [ Links ]

17. Saz, O., Lleida, E., Abarca, L. & Mejuto, S. (2009). Mouseclick: Acceso Al Ordenador A Través De La Voz. IV Jornadas Iberoamericanas de Tecnologías de Apoyo a Discapacidad, Las tecnologías de apoyo en parálisis cerebral, Madrid, Spain, 63–67.         [ Links ]

18. Saz, O., Yin, S.–C., Lleida, E., Rose, R., Vaquero, C. & Rodríguez, W.–R. (2009). Tools and Technologies for Computer–Aided Speech and Language Therapy. Speech Communication, 51(10), 948–96.         [ Links ]

19. Rabiner, L.–R. & Schafer, R.–W. (1978). Digital Processing of Speech Signals. Englewood Cliffs NJ, USA: Prentice Hall.         [ Links ]

20. Rahman, M.–S. & Shimamura, T. (2005). Formant Frequency Estimation of High–Pitched Speech by Homomorphic Prediction. Acoustical Science and Technology, 26(6), 502–510.         [ Links ]

21. Sharma, H.–V. & Hasegawa–Johnson, M., (2009). Universal Access: Speech Recognition for Talkers with Spastic Dysarthria. 10th Annual Conference of the International Speech Communication Association (INTERSPEECH 2009), Brighton, United Kingdom, 1451–1454        [ Links ]

22. Varona, J., Manresa–Yee, C. & Perales, F.–J. (2008). Hands–Free Vision–Based Interface for Computer Accessibility, Journal of Network and Computer Applications, 31(4), 357–374.         [ Links ]

23. Wakita, H. (1977). Normalization of Vowels by Vocal Tract Length and its Application to Vowel Identification. IEEE Transactions on Acoustic, Speech Signal Processing, 25(2), 183–192.         [ Links ]