Spoken Language Identification for Short Utterance with Transfer Learning

Montalvo-Bereau, Ana; Calvo-de-Lara, José Ramón; Hernández-Sierra, Gabriel; Reyes-Díaz, Flavio

doi:10.13053/cys-28-3-5180


Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

MONTALVO-BEREAU, Ana; CALVO-DE-LARA, José Ramón; HERNANDEZ-SIERRA, Gabriel and REYES-DIAZ, Flavio. Spoken Language Identification for Short Utterance with Transfer Learning. Comp. y Sist. [online]. 2024, vol.28, n.3, pp.1487-1497. Epub 21-Jan-2025. ISSN 2007-9737. https://doi.org/10.13053/cys-28-3-5180.

Spoken language recognition is a research field that has received considerable attention because of its impact on several tasks in multilingual speech processing. Although contextual and auxiliary-task information has been shown to improve results in this field, this avenue has not been fully explored. In the present work, we address spoken language recognition in short utterances by using two speech-related tasks as auxiliaries in a multi-task architecture. The primary task was language recognition, with sex and speaker identity serving as auxiliary tasks. Three models from different approaches were implemented and trained under both single-task and multi-task learning paradigms. The models considered were 2D-CNN based; one of them was a proposed configuration designed for utterances shorter than one second. The experiments were conducted on a subset of the VoxForge corpus containing a markedly limited number of signals. The results demonstrate that, across the three models, spoken language recognition benefits from multi-task learning with sex and speaker identity as auxiliary tasks.
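As a rough illustration of the multi-task setup described in the abstract, the sketch below combines a primary language loss with auxiliary sex and speaker losses as a weighted sum of cross-entropies. The abstract does not say how the task losses are combined, so the weighted-sum form, the weight values, the logits, and the function names `cross_entropy` and `multitask_loss` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(head_logits, targets, weights):
    """Weighted sum of per-task cross-entropies over the tasks in `weights`."""
    return sum(
        weights[task] * cross_entropy(head_logits[task], targets[task])
        for task in weights
    )


# Hypothetical outputs of three classification heads that would sit on top of
# a shared 2D-CNN encoder, for a single utterance (values are made up).
head_logits = {
    "language": [2.0, 0.5, -1.0],
    "sex": [0.3, -0.3],
    "speaker": [0.0, 0.0, 0.0, 0.0],
}
targets = {"language": 0, "sex": 0, "speaker": 2}

# Assumed weights: the primary task dominates, auxiliaries are down-weighted.
weights = {"language": 1.0, "sex": 0.3, "speaker": 0.3}

multi = multitask_loss(head_logits, targets, weights)
single = multitask_loss(head_logits, targets, {"language": 1.0})
print(f"single-task loss: {single:.4f}  multi-task loss: {multi:.4f}")
```

In a single-task run only the `language` term would be present; in a multi-task run the auxiliary terms also send gradients into the shared encoder, which is the mechanism through which auxiliary tasks can help the primary one.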

Keywords: Spoken language recognition; deep learning; transfer learning; multi-task learning.
