SciELO - Scientific Electronic Library Online

 


Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

HARRAT, Salima; MEFTOUH, Karima and SMAILI, Kamel. Script Independent Morphological Segmentation for Arabic Maghrebi Dialects: An Application to Machine Translation. Comp. y Sist. [online]. 2019, vol.23, n.3, pp.979-989. Epub 09-Aug-2021. ISSN 2007-9737. https://doi.org/10.13053/cys-23-3-3267.

This research addresses resource creation for under-resourced languages. We adapt resources that exist for well-resourced languages to process less-resourced ones, focusing on the Arabic dialects of the Maghreb, namely Algerian, Moroccan and Tunisian. We first adapt a well-known statistical word segmenter to segment Algerian dialect texts written in both Arabic and Latin scripts, and demonstrate that unsupervised morphological segmentation can be applied to Arabic dialects regardless of the script used. Next, we use this segmentation to improve statistical machine translation scores between the three Maghrebi dialects and French, using a parallel multidialectal corpus that includes six Arabic dialects in addition to Modern Standard Arabic (MSA) and French. For word segmentation, the rate of correctly segmented words reached 70% for words written in Latin script and 79% for words written in Arabic script. For machine translation, unsupervised morphological segmentation reduced out-of-vocabulary word rates by at least 35%.

Keywords: Arabic dialects; morphological segmentation; machine translation.
