SciELO - Scientific Electronic Library Online

 

Polibits

Online version ISSN 1870-9044

Abstract

TURCHI, Marco and EHRMANN, Maud. Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources. Polibits [online]. 2011, n.43, pp. 37-43. ISSN 1870-9044.

The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on its parallel training data: phrases that are not present in that data are not translated correctly. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each new association corresponds to a morphological variation of the source phrase, the target phrase, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translation, and the results showed improved performance in terms of automatic scores (BLEU and Meteor) and a reduction in out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model.
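As a rough illustration of the idea of deriving new phrase associations from morphological variants, the sketch below expands one phrase pair using a toy lexicon. Everything in it is an assumption made for illustration: the `Entry` records, the `LEXICON` contents, the Jaccard overlap of tag sets (a simple stand-in for the similarity score, which the abstract does not specify), and the `threshold` value. It covers only the case where both sides of a pair are varied, and it does not model how scores are assigned in the translation or reordering models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One row of a hypothetical morphological lexicon."""
    form: str
    lemma: str
    tags: frozenset


# Toy lexicon invented for this example; a real resource would be far larger.
LEXICON = [
    Entry("house", "house", frozenset({"NOUN", "SG"})),
    Entry("houses", "house", frozenset({"NOUN", "PL"})),
    Entry("maison", "maison", frozenset({"NOUN", "SG", "FEM"})),
    Entry("maisons", "maison", frozenset({"NOUN", "PL", "FEM"})),
]


def variants(word, lexicon):
    """Return lexicon entries that share a lemma with `word` but differ in form."""
    lemmas = {e.lemma for e in lexicon if e.form == word}
    return [e for e in lexicon if e.lemma in lemmas and e.form != word]


def tag_similarity(a, b):
    """Jaccard overlap of two tag sets (an assumed stand-in score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def phrase_variants(phrase, lexicon):
    """Yield (new_phrase, tags) with exactly one word swapped for another form."""
    words = phrase.split()
    for i, word in enumerate(words):
        for entry in variants(word, lexicon):
            yield " ".join(words[:i] + [entry.form] + words[i + 1:]), entry.tags


def expand_pair(src, tgt, lexicon, threshold=0.5):
    """Propose new (source, target, score) pairs by varying both phrases."""
    proposals = []
    for s_phrase, s_tags in phrase_variants(src, lexicon):
        for t_phrase, t_tags in phrase_variants(tgt, lexicon):
            score = tag_similarity(s_tags, t_tags)
            # Keep only pairs whose morphosyntactic tags agree well enough.
            if score >= threshold:
                proposals.append((s_phrase, t_phrase, score))
    return proposals


if __name__ == "__main__":
    for proposal in expand_pair("house in", "maison dans", LEXICON):
        print(proposal)
```

With this toy lexicon, the existing pair "house in" / "maison dans" yields the plural pair "houses in" / "maisons dans", since the two plural forms share most of their tags. Swapping a single word ignores agreement with neighbouring words, so a fuller version would need to vary several words of a phrase together.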

Keywords: Machine translation; knowledge; morphological resources.
