Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 18 no. 3, Ciudad de México, Jul./Sep. 2014

https://doi.org/10.13053/CyS-18-3-2028 

Regular articles

 

Vector Space Basis Change in Information Retrieval

 

Rabeb Mbarek1, Mohamed Tmar1, and Hawete Hattab2

 

1 Multimedia Information Systems and Advanced Computing Laboratory, High Institute of Computer Science and Multimedia, University of Sfax, Sfax, Tunisia. rabeb.hattab@gmail.com, mohamedtmar@yahoo.fr

2 Umm Al-Qura University, Makkah, Saudi Arabia. hattab.hawete@yahoo.fr

 

Article received on 07/01/2014.
Accepted on 30/01/2014.

 

Abstract

Vector Space Basis Change (VSBC) is an algebraic operator that performs a change of basis and is parameterized by a transition matrix. When the basis of the vector space changes, the components of each vector change according to this matrix. The VSBC strategy has been shown to be effective at separating relevant documents from irrelevant ones, and several feedback algorithms based on it have recently been developed. In those algorithms, the transition matrix is built with optimization methods. In this paper, we propose a simple, convenient, and direct method for building a transition matrix, and we develop a relevance feedback algorithm based on it. Experimental results on a TREC collection show that the proposed method is effective and generally superior to known VSBC-based models. We also show that its improvement over these models is statistically significant.

Keywords: Vector space model, vector space basis change, VSBC-based model, relevance feedback.
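
To make the change-of-basis operation described in the abstract concrete, the following is a minimal sketch, not the authors' method for building the transition matrix. It assumes a two-term vocabulary, a hand-picked invertible matrix `transition` whose columns are the new basis vectors written in the old basis, and inner-product scoring. The function names and the data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def invert_2x2(m):
    """Invert a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("transition matrix must be invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def change_basis(transition, vector):
    """Return the coordinates of `vector` in the new basis.

    Assumption: the columns of `transition` are the new basis vectors
    expressed in the old basis, so new = transition^{-1} * old.
    """
    return mat_vec(invert_2x2(transition), vector)


def dot(u, v):
    """Inner product, used here as an assumed retrieval score."""
    return sum(x * y for x, y in zip(u, v))


# Hypothetical data: new basis vectors are (1, 0) and (1, 1).
transition = [[1, 1],
              [0, 1]]
query = [1, 1]
doc_a = [2, 2]   # imagined relevant document
doc_b = [3, 0]   # imagined irrelevant document

# Scores in the original basis: 4 for doc_a, 3 for doc_b.
old_scores = (dot(query, doc_a), dot(query, doc_b))

# Scores after re-expressing every vector in the new basis: 2 and 0.
q_new = change_basis(transition, query)
new_scores = (dot(q_new, change_basis(transition, doc_a)),
              dot(q_new, change_basis(transition, doc_b)))
```

In this toy case the two documents score 4 and 3 in the original basis and 2 and 0 after the basis change, so the same vectors are ranked with a wider relative gap once their components are recomputed through the transition matrix. How to choose that matrix from relevance judgments is the subject of the paper and is not reproduced here.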

 


 


All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.