Summary Evaluation with and without References

Torres-Moreno, Juan-Manuel; Saggion, Horacio; Cunha, Iria da; SanJuan, Eric; Velázquez-Morales, Patricia

Polibits

On-line version ISSN 1870-9044

Polibits no. 42, México, Jul./Dec. 2010

Juan-Manuel Torres-Moreno¹, Horacio Saggion³, Iria da Cunha⁴, Eric SanJuan², and Patricia Velázquez-Morales⁵

¹ LIA/Université d'Avignon, France and École Polytechnique de Montréal, Canada. (juan-manuel.torres@univ-avignon.fr).

² LIA/Université d'Avignon, France. (eric.sanjuan@univ-avignon.fr).

³ DTIC/Universitat Pompeu Fabra, Spain. (horacio.saggion@upf.edu).

⁴ IULA/Universitat Pompeu Fabra, Spain; LIA/Université d'Avignon, France and Instituto de Ingeniería/UNAM, Mexico. (iria.dacunha@upf.edu).

⁵ VM Labs, France. (patricia_velazquez@yahoo.com).

Manuscript received June 8, 2010.
Manuscript accepted for publication July 25, 2010.

Abstract

We study a new content-based method for evaluating text summarization systems without human models; the method is used to produce system rankings. The research is carried out with a new content-based evaluation framework, called Fresa, which computes a variety of divergences among probability distributions. We apply our comparison framework to several well-established content-based evaluation measures in text summarization, such as COVERAGE, RESPONSIVENESS, PYRAMIDS and ROUGE, and study their associations across several summarization tasks: generic multi-document summarization in English and French, focus-based multi-document summarization in English, and generic single-document summarization in French and Spanish.

Key words: Text summarization evaluation, content-based evaluation measures, divergences.
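To make the idea of ranking systems by a divergence between probability distributions concrete, the following is a minimal sketch only. The abstract does not say which divergences Fresa computes, how the distributions are estimated, or how texts are preprocessed. The sketch assumes the Jensen-Shannon divergence over unsmoothed unigram distributions with whitespace tokenization, and all function names are hypothetical, not part of Fresa.

```python
import math
from collections import Counter


def unigram_distribution(text):
    """Relative frequency of each lowercased, whitespace-separated token.

    Assumption: no stemming, stop-word removal, or smoothing is applied.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Requires q[w] > 0 wherever p[w] > 0, which holds when q is the
    mixture distribution built in js_divergence.
    """
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between p and q (0 = identical, 1 = disjoint)."""
    vocabulary = set(p) | set(q)
    mixture = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocabulary}
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


def rank_summaries(source, summaries):
    """Order summary names from lowest to highest divergence to the source.

    No human reference summary is involved: each summary is compared
    directly with the source text.
    """
    source_dist = unigram_distribution(source)
    scores = {
        name: js_divergence(source_dist, unigram_distribution(text))
        for name, text in summaries.items()
    }
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    source = "the cat sat on the mat"
    candidates = {"A": "the cat sat", "B": "dogs bark loudly"}
    print(rank_summaries(source, candidates))
```

In this toy example summary B shares no words with the source, so its divergence reaches the maximum of 1 and it is ranked after A. A ranking obtained this way could then be compared with rankings from reference-based measures using a rank correlation coefficient.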

ACKNOWLEDGMENT

We are grateful to the Programa Ramón y Cajal of the Ministerio de Ciencia e Innovación, Spain. This work is partially supported by a postdoctoral grant from the National Program for Mobility of Research Human Resources (National Plan of Scientific Research, Development and Innovation 2008–2011, Ministerio de Ciencia e Innovación, Spain); by the CONACyT research project number 82050; and by the PAPIIT-DGAPA research project (Universidad Nacional Autónoma de México) number IN403108.
