
Computación y Sistemas

Print version ISSN 1405-5546

Comp. y Sist. vol.18 n.3 México Jul./Sep. 2014

http://dx.doi.org/10.13053/CyS-18-3-2028 

Regular articles

 

Vector Space Basis Change in Information Retrieval

 

Rabeb Mbarek¹, Mohamed Tmar¹, and Hawete Hattab²

 

¹ Multimedia Information Systems and Advanced Computing Laboratory, High Institute of Computer Science and Multimedia, University of Sfax, Sfax, Tunisia. rabeb.hattab@gmail.com, mohamedtmar@yahoo.fr

² Umm Al-Qura University, Makkah, Saudi Arabia. hattab.hawete@yahoo.fr

 

Article received on 07/01/2014.
Accepted on 30/01/2014.

 

Abstract

Vector Space Basis Change (VSBC) is an algebraic operator that performs a change of basis and is parameterized by a transition matrix. When the basis of the vector space changes, the components of each vector change according to this matrix. The VSBC strategy has been shown to be effective in separating relevant documents from irrelevant ones, and several feedback algorithms have recently been developed with it. So far, optimization methods have been used to build the transition matrix. In this paper, we propose a simple, convenient, and direct method to build a transition matrix and, based on this method, we develop a relevance feedback algorithm. Experimental results on a TREC collection show that the proposed method is effective and generally superior to known VSBC-based models, and that its improvement over these models is statistically significant.

Keywords: Vector space model, vector space basis change, VSBC-based model, relevance feedback.
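
As an illustration of the change-of-basis operation described in the abstract, the following minimal sketch re-expresses a query and two documents in a new basis and compares inner-product scores before and after. It is not the construction proposed in the paper: the two-term vocabulary, the vectors, the hand-picked transition matrix, and all function names are assumptions made only for this example. The columns of the transition matrix are taken to be the new basis vectors written in the original basis, so new coordinates are obtained by applying its inverse.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inverse_2x2(matrix):
    """Invert a 2x2 matrix; raise if it is singular."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("transition matrix must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def change_basis(transition, vector):
    """Coordinates of `vector` in the basis given by the columns of `transition`."""
    return mat_vec(inverse_2x2(transition), vector)


def dot(u, v):
    """Inner product used here as the retrieval score."""
    return sum(x * y for x, y in zip(u, v))


# Hypothetical toy data over a two-term vocabulary (not from the paper).
transition = [[1.0, 0.0],
              [1.0, 1.0]]          # hand-picked, invertible
query = [1.0, 1.0]
relevant_doc = [2.0, 1.0]
irrelevant_doc = [1.0, 3.0]

# Scores in the original basis: the irrelevant document ranks first.
score_rel_before = dot(query, relevant_doc)
score_irr_before = dot(query, irrelevant_doc)

# Scores after re-expressing every vector in the new basis.
q_new = change_basis(transition, query)
score_rel_after = dot(q_new, change_basis(transition, relevant_doc))
score_irr_after = dot(q_new, change_basis(transition, irrelevant_doc))

print(score_rel_before, score_irr_before)
print(score_rel_after, score_irr_after)
```

With this particular matrix the relevant document overtakes the irrelevant one after the basis change. Because the matrix is not orthogonal, inner products are not preserved, which is what allows a suitably chosen transition matrix to alter the ranking.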

 


 


All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.