SciELO - Scientific Electronic Library Online




On-line version ISSN 1870-9044

Polibits no. 43, México, Jan./Jun. 2011


Assessing the Feature–Driven Nature of Similarity–based Sorting of Verbs


Pinar Öztürk, Mila Vulchanova, Christian Tumyr, Liliana Martinez, and David Kabath


Norwegian University of Science and Technology, Trondheim, Norway


Manuscript received November 16, 2010.
Manuscript accepted for publication January 22, 2011.



The paper presents a computational analysis of the results from a sorting task with motion verbs in Norwegian. The sorting behavior of humans rests on the features they use when they compare two or more words. We investigate what these features are and how discriminating each feature is in sorting. Our method of analysis rests on the assumption that a sorting task involves a similarity assessment process: a set of features underlies the similarity judgment, and the similarity between two verbs is the weighted sum of their similarities on each feature in the set. The computational methodology is as follows. Based on the frequency of co–occurrence of verbs in the human–generated clusters, weights for a given set of features are computed using linear regression. The weights are used, in turn, to compute a similarity matrix between the verbs, which serves as input to agglomerative hierarchical clustering. If the selected (projected) set of features matches the features the participants used when sorting verbs into groups, the clusters obtained with this computational method should align with the clusters generated by humans. Otherwise, the feature set is modified and the process is repeated. Feature sets that yield clusters aligned with the human–generated clusters are then evaluated by a set of human experts, and the results show that the method identifies appropriate feature sets. The method can be applied to a variety of data, ranging from experimental free production data to linguistic data from controlled experiments, in the assessment of semantic relations and hierarchies within and across languages.

Key words: Verb features, verb sorting, similarity.
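
The pipeline described in the abstract (co–occurrence frequencies, regression weights, similarity matrix, agglomerative clustering, comparison with the human groups) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the four English verb labels, the three binary features, the co–occurrence values, the use of feature agreement (1 if two verbs share a feature value, 0 otherwise) as the per–feature similarity, the absence of an intercept in the regression, and average linkage as the merging criterion.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): verbs and binary features.
verbs = ["walk", "run", "crawl", "creep"]
feature_names = ["upright", "fast", "effortful"]
features = [
    [1, 0, 0],  # walk
    [1, 1, 1],  # run
    [0, 0, 1],  # crawl
    [0, 0, 0],  # creep
]
# Hypothetical fraction of participants who put each pair in the same group.
cooccurrence = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.9],
    [0.2, 0.0, 0.9, 1.0],
]
human_clusters = [{"walk", "run"}, {"crawl", "creep"}]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def feature_matches(i, j):
    """Per-feature similarity of two verbs: 1.0 where they agree, else 0.0."""
    return [1.0 if a == b else 0.0 for a, b in zip(features[i], features[j])]


def fit_weights():
    """Least-squares regression of pair co-occurrence on feature agreement."""
    pairs = list(combinations(range(len(verbs)), 2))
    rows = [feature_matches(i, j) for i, j in pairs]
    targets = [cooccurrence[i][j] for i, j in pairs]
    k = len(feature_names)
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(k)]
    return solve(xtx, xty)


def similarity_matrix(weights):
    """Similarity of two verbs = weighted sum of per-feature similarities."""
    n = len(verbs)
    return [
        [sum(w * m for w, m in zip(weights, feature_matches(i, j)))
         for j in range(n)]
        for i in range(n)
    ]


def agglomerate(sim, n_clusters):
    """Average-linkage agglomerative clustering on a similarity matrix."""
    groups = [[i] for i in range(len(sim))]

    def avg_sim(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(groups) > n_clusters:
        # Merge the two groups with the highest average similarity.
        a, b = max(combinations(range(len(groups)), 2),
                   key=lambda p: avg_sim(groups[p[0]], groups[p[1]]))
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return [[verbs[i] for i in g] for g in groups]


weights = fit_weights()
clusters = agglomerate(similarity_matrix(weights), n_clusters=2)
# The feature set is kept if the computed clusters match the human ones;
# otherwise it would be modified and the whole process repeated.
aligned = {frozenset(c) for c in clusters} == {frozenset(c) for c in human_clusters}
print(dict(zip(feature_names, weights)), clusters, aligned)
```

With these toy values the two computed groups coincide with the assumed human grouping, so the hypothetical feature set would be retained. A feature set that failed this check would be revised and the fit repeated, as the abstract describes.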






Creative Commons License. All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.