
Polibits

On-line version ISSN 1870-9044

Polibits no. 43, México, Jan./Jun. 2011

 

Semantic Aspect Retrieval for Encyclopedia

 

Chao Han, Yicheng Liu, Yu Hao, and Xiaoyan Zhu

 

Department of Computer Science and Technology, Tsinghua University, China (e-mail: hanc04@gmail.com).

 

Manuscript received November 1, 2010.
Manuscript accepted for publication December 21, 2010.

 

Abstract

With the development of Web 2.0, more and more people contribute their knowledge to the Internet. Many general and domain-specific online encyclopedia resources have become available, and they are valuable for many Natural Language Processing (NLP) applications, such as summarization and question answering. We propose a novel encyclopedia-specific method for retrieving passages that are semantically related to a short query (usually consisting of only one word or phrase) from a given article in the encyclopedia. The method captures the expression word features and categorical word features in the snippets surrounding the aspect words by building massive hybrid language models. These local models outperform global models such as LSA and ESA on our task.

Key words: Aspect retrieval, online encyclopedia, semantic relatedness.
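
The abstract does not specify how the hybrid language models are built, so the following is only a minimal sketch of the general idea of a local language model for aspect retrieval, not the authors' method. It assumes a plain unigram model estimated from snippets that surround an aspect word, interpolated with an add-one-smoothed background model, and it ranks passages by their average per-token log-likelihood ratio against the background. The function names, the interpolation weight `lam`, and the whitespace tokenizer are all illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def unigram_counts(texts):
    """Count token occurrences over a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def score_passage(passage, aspect_counts, background_counts, lam=0.7):
    """Average per-token log ratio of the interpolated aspect model
    to the background model (hypothetical scoring choice)."""
    tokens = tokenize(passage)
    if not tokens:
        return float("-inf")
    a_total = sum(aspect_counts.values())
    b_total = sum(background_counts.values())
    vocab = len(background_counts) + 1  # one extra slot for unseen tokens
    total = 0.0
    for tok in tokens:
        # Add-one smoothing keeps the background probability non-zero.
        p_bg = (background_counts[tok] + 1) / (b_total + vocab)
        p_as = aspect_counts[tok] / a_total if a_total else 0.0
        p_mix = lam * p_as + (1.0 - lam) * p_bg
        total += math.log(p_mix / p_bg)
    return total / len(tokens)


def rank_passages(passages, aspect_snippets, background_texts, lam=0.7):
    """Return (score, passage) pairs, best match first."""
    aspect_counts = unigram_counts(aspect_snippets)
    background_counts = unigram_counts(background_texts)
    scored = [
        (score_passage(p, aspect_counts, background_counts, lam), p)
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

In this sketch the snippets act as the local evidence for one aspect (for example, text that surrounds the word "history" in other articles), while the background texts stand in for the encyclopedia as a whole. A passage scores well when its words are more typical of the aspect snippets than of the background.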

 


 

REFERENCES

[1] J. Lin, D. Quan, V. Sinha, K. Bakshi, D. Huynh, B. Katz, and D. R. Karger, "The role of context in question answering systems," in Proceedings of the 2003 Conference on Human Factors in Computing Systems, 2003.

[2] S. Ye, T. Chua, and J. Lu, "Summarizing Definition from Wikipedia," in Proceedings of the 47th Annual Meeting of the ACL, Singapore, 2009.

[3] C. Li, N. Yan, S. B. Roy, L. Lisham, and G. Das, "Facetedpedia: Dynamic Generation of Query Dependent Faceted Interfaces for Wikipedia," in Proceedings of International World Wide Web Conference, Raleigh, North Carolina, USA, 2010.

[4] R. Hahn, C. Bizer, C. Sahnwaldt, C. Herta, S. Robinson, M. Bürgle, H. Düwiger, and U. Scheel, "Faceted Wikipedia Search," in 13th International Conference on Business Information Systems (BIS), 2010.

[5] R. B. Yates and B. R. Neto, Modern Information Retrieval, Addison Wesley, New York, NY, 1999.

[6] C. Fellbaum, WordNet: An Electronic Lexical Database, MIT Press, Cambridge, MA, 1998.

[7] A. Budanitsky and G. Hirst, "Evaluating WordNet-based Measures of Lexical Semantic Relatedness," Computational Linguistics, 2006, pp. 13–47.

[8] P. Roget, Roget's Thesaurus of English Words and Phrases, Longman Group Ltd., 1852.

[9] S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, "Indexing by Latent Semantic Analysis," Journal of the American Society for Information Science, 1990, pp. 391–407.

[10] E. Gabrilovich and S. Markovitch, "Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis," in Proceedings of IJCAI, 2007, pp. 1606–1611.

[11] E. Hatcher and O. Gospodnetic, Lucene in Action, Manning Publications, 2005.

[12] J. M. Ponte and W. B. Croft, "A Language Modeling Approach to Information Retrieval," in Proceedings of the 21st Intl. ACM SIGIR Conf., 1998, pp. 275–281.