


On-line version ISSN 1870-9044

Polibits  n.42 México Jul./Dec. 2010


Summary Evaluation with and without References


Juan–Manuel Torres–Moreno1, Horacio Saggion3, Iria da Cunha4, Eric SanJuan2, and Patricia Velázquez–Morales5


1 LIA/Université d'Avignon, France and École Polytechnique de Montreal, Canada. (juan–manuel.torres@univ–

2 LIA/Université d'Avignon, France. (eric.sanjuan@univ–

3 DTIC/Universitat Pompeu Fabra, Spain.

4 IULA/Universitat Pompeu Fabra, Spain; LIA/Université d'Avignon, France and Instituto de Ingeniería/UNAM, Mexico.

5 VM Labs, France.


Manuscript received June 8, 2010.
Manuscript accepted for publication July 25, 2010.



We study a new content-based method for evaluating text summarization systems without human models, and we use it to produce system rankings. The research is carried out with a new content-based evaluation framework called Fresa, which computes a variety of divergences among probability distributions. We apply our comparison framework to various well-established content-based evaluation measures in text summarization, such as COVERAGE, RESPONSIVENESS, PYRAMIDS and ROUGE, and study their associations across several summarization tasks: generic multi-document summarization in English and French, focus-based multi-document summarization in English, and generic single-document summarization in French and Spanish.

Key words: Text summarization evaluation, content-based evaluation measures, divergences.
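
As a rough illustration of what evaluating a summary without human models by a divergence among probability distributions can look like, the sketch below compares the word distribution of a source text with that of each candidate summary and ranks the candidates by divergence. This is not the Fresa implementation: the choice of the Jensen-Shannon divergence, the whitespace tokenization, and the function names (`unigram_distribution`, `jensen_shannon`, `rank_summaries`) are all assumptions made for this example.

```python
import math
from collections import Counter


def unigram_distribution(text):
    """Relative frequencies of lowercase whitespace tokens (assumed tokenization)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def kl_divergence(p, q):
    """KL(p || q) in bits; q must be positive wherever p is positive."""
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits, between 0 (identical) and 1 (disjoint)."""
    vocab = set(p) | set(q)
    # Mixture distribution: positive on every word seen in p or q.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_summaries(source, summaries):
    """Order summary names from lowest to highest divergence from the source."""
    src = unigram_distribution(source)
    scores = {
        name: jensen_shannon(src, unigram_distribution(text))
        for name, text in summaries.items()
    }
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    source = "the cat sat on the mat"
    candidates = {"sys_a": "the cat sat", "sys_b": "stock prices fell"}
    print(rank_summaries(source, candidates))
```

Under these assumptions, a lower divergence indicates that a summary's vocabulary is distributed more like the source's, so the ranking needs no reference summaries. A ranking obtained this way could then be compared with rankings from reference-based measures.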





We are grateful to the Programa Ramón y Cajal of the Ministerio de Ciencia e Innovación, Spain. This work is partially supported by a postdoctoral grant from the National Program for Mobility of Research Human Resources (National Plan of Scientific Research, Development and Innovation 2008-2011, Ministerio de Ciencia e Innovación, Spain); the CONACyT research project number 82050; and the PAPIIT-DGAPA research project (Universidad Nacional Autónoma de México) number IN403108.




All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.