Iterative Feedback Based Manifold-Ranking for Update Summary

Ruifang, He; Bing, Qin; Ting, Liu; Yang, Liu; Sheng, Li


Polibits

On-line version ISSN 1870-9044

Polibits n.37 México Jan./Jun. 2008

Special section: natural language processing


He Ruifang, Qin Bing, Liu Ting, Liu Yang, and Li Sheng

Information Retrieval Lab, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 15001, China (phone: +86–451–86413683–801; fax: +86–451–86413683–812; e–mail: rfhe@ir.hit.edu.cn).

Manuscript received May 10, 2008.
Manuscript accepted for publication June 20, 2008.

Abstract

The update summary, defined as a new task in DUC2007, aims to capture how information on a single topic evolves over time. It delivers focused information to a user who has already read a set of older documents on the same topic. This paper presents a novel manifold–ranking framework based on an iterative feedback mechanism for this summarization task. The topic set is extended with the summaries of previous timeslices and the first sentences of the documents in the current timeslice. The iterative feedback mechanism models the dynamically evolving character of the data and represents the relay propagation of information across timeslices. The modified manifold–ranking process also naturally exploits both the relationships among all the sentences in the documents and the relationships between the topic and the sentences. The ranking score that each sentence obtains in the manifold–ranking process denotes its importance biased towards the topic; a greedy algorithm then reranks the sentences to remove redundant information. The summary is produced by choosing the sentences with the highest ranking scores. Experiments on the dataset of the DUC2007 update task demonstrate encouraging performance of the proposed approach.

Key words: Temporal multi–document summarization, update summary, iterative feedback based manifold–ranking.
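
The ranking-then-reranking idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard manifold-ranking iteration f ← αSf + (1 − α)y with S = D^(−1/2) W D^(−1/2), and a simple greedy discount for redundancy. The function names, the toy similarity matrix, the values of `alpha` and `penalty`, and the exact form of the discount are all assumptions made for illustration; the iterative feedback across timeslices is not modeled here.

```python
import math


def manifold_rank(W, y, alpha=0.6, iters=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y, with S = D^-1/2 W D^-1/2.

    W is a symmetric similarity matrix with a zero diagonal; y marks the
    topic nodes with 1 and ordinary sentences with 0.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] > 0 and deg[j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    f = list(y)
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


def greedy_select(scores, W, candidates, k, penalty=1.0):
    """Pick k candidates greedily; after each pick, discount similar ones.

    The discount (penalty * similarity * score of the picked sentence) is an
    illustrative choice, not the formula used in the paper.
    """
    remaining = {i: scores[i] for i in candidates}
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=remaining.get)
        best_score = remaining.pop(best)
        chosen.append(best)
        for j in remaining:
            remaining[j] -= penalty * W[best][j] * best_score
    return chosen


# Toy graph (assumed data): node 0 is the topic, nodes 1-4 are sentences.
# Sentences 1 and 2 are near-duplicates that both match the topic closely.
W = [
    [0.0, 0.8, 0.7, 0.1, 0.0],
    [0.8, 0.0, 0.9, 0.2, 0.1],
    [0.7, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.0, 0.1, 0.1, 0.3, 0.0],
]
y = [1.0, 0.0, 0.0, 0.0, 0.0]

scores = manifold_rank(W, y)
summary = greedy_select(scores, W, candidates=[1, 2, 3, 4], k=2)
```

In this toy setting the two topic-relevant sentences receive the highest ranking scores, but once one of them is selected the other is heavily discounted, so the second slot goes to a less redundant sentence. Under the approach described in the abstract, additional topic nodes (previous summaries and first sentences of the current timeslice) would simply be further entries of `y` set to 1.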

