Polibits

On-line version ISSN 1870-9044

Polibits  n.47 México Jan./Jul. 2013

 

Recommending Machine Translation Output to Translators by Estimating Translation Effort: A Case Study

 

Prashant Mathur¹, Nick Ruiz¹, and Marcello Federico²

¹ University of Trento and FBK, Italy.

² FBK, Italy.

 

Manuscript received on December 7, 2012.
Accepted for publication on January 11, 2013.

 

Abstract

In this paper we use the statistics from a field experiment to explore the utility of supplying machine translation suggestions in a computer-assisted translation (CAT) environment. Regression models are trained for each user to estimate the time to edit (TTE) for the current translation segment. We combine features of the current segment with aggregated features of previously translated segments, selected with content-based filtering approaches commonly used in recommendation systems. We present and evaluate decision function heuristics that determine whether machine translation output will be useful to the translator for a given segment. We find that, for some users, our regression models predict TTE reasonably well from only a small number of training examples, although noise in the actual TTE for seemingly similar segments yields large error margins. We propose including TTE estimation in CAT recommendation systems as a well-correlated metric for translation quality.

Key words: Machine translation, computer-assisted translation, quality estimation, recommender systems.
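
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single regression feature (segment word count), the bag-of-words cosine similarity used for content-based filtering, the per-word threshold rule, and all names (`HISTORY`, `neighbour_tte`, `fit_line`, `recommend_mt`, `max_seconds_per_word`) are assumptions chosen for illustration only.

```python
import math
from collections import Counter

# Hypothetical history for ONE translator: (source segment, observed TTE in seconds).
HISTORY = [
    ("the cat sat", 6.0),
    ("the dog sat down", 8.0),
    ("a very long sentence about machine translation quality", 16.0),
]


def bag_of_words(text):
    """Lower-cased whitespace tokens counted into a Counter."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def neighbour_tte(segment, history, k=2):
    """Aggregated feature: mean TTE of the k past segments most similar to `segment`."""
    query = bag_of_words(segment)
    ranked = sorted(
        history, key=lambda item: cosine(query, bag_of_words(item[0])), reverse=True
    )
    top = ranked[:k]
    return sum(tte for _, tte in top) / len(top)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def predict_tte(segment, history):
    """Per-user regression of TTE on word count (an assumed, simplified feature set)."""
    xs = [len(text.split()) for text, _ in history]
    ys = [tte for _, tte in history]
    intercept, slope = fit_line(xs, ys)
    return intercept + slope * len(segment.split())


def recommend_mt(predicted_tte, word_count, max_seconds_per_word=2.5):
    """Assumed decision heuristic: show MT output if the predicted effort per word is low."""
    return predicted_tte / word_count <= max_seconds_per_word
```

For a new segment, `predict_tte(segment, HISTORY)` gives the per-user estimate, `neighbour_tte(segment, HISTORY)` gives the aggregate over similar past segments, and `recommend_mt` turns the estimate into a show-or-hide decision. A fuller model would feed both quantities, along with other segment features, into one regression.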

 


 

ACKNOWLEDGMENTS

This work is partially funded by the European Commission under the FP7 project MateCat, Grant 287688. The authors wish to thank Georgia Koutrika for her valuable suggestions in this experiment.

 


All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.