SciELO - Scientific Electronic Library Online

 

Polibits

On-line version ISSN 1870-9044

Polibits no. 44, México, Jul./Dec. 2011

 

An Approach to Cross–Lingual Textual Entailment using Online Machine Translation Systems

 

Julio Castillo¹ and Marina Cárdenas²

 

1 National University of Cordoba – FaMAF, Cordoba, Argentina, and National Technological University – Regional Faculty of Cordoba, Argentina (email: jotacastillo@gmail.com).

2 National Technological University–Regional Faculty of Cordoba, Argentina (email: ing.marinacardenas@gmail.com).

 

Manuscript received July 1, 2011.
Manuscript accepted for publication October 2, 2011.

 

Abstract

In this paper, we present an approach to cross–lingual textual entailment (CLTE) that uses machine translation systems such as Bing Translator and Google Translate. We experiment with a wide variety of data sets for the task of textual entailment (TE) and evaluate the contribution of an algorithm for expanding a monolingual TE corpus, which seems promising for the task of CLTE. We built a CLTE corpus, and we report a procedure that can be used to create a CLTE corpus in any pair of languages. We also report the results obtained in our experiments with the three–way classification task for CLTE, and we show that these results outperform the average score of RTE (Recognizing Textual Entailment) systems. Finally, we find that, using WordNet as the only source of lexical–semantic knowledge, it is possible to build a system for CLTE that achieves results comparable to the average score of RTE systems for both the two–way and three–way tasks.

Key words: Cross–lingual textual entailment, textual entailment, WordNet, bilingual textual entailment corpus.

 


 
