Online version ISSN 1870-9044
Polibits no. 43, México, Jan./Jun. 2011
Assessing the Feature-Driven Nature of Similarity-based Sorting of Verbs
Pinar Öztürk, Mila Vulchanova, Christian Tumyr, Liliana Martinez, and David Kabath
Norwegian University of Science and Technology, Trondheim, Norway (email: Pinar.Ozturk@ifi.unit.no).
Manuscript received November 16, 2010.
Manuscript accepted for publication January 22, 2011.
This paper presents a computational analysis of the results of a sorting task with Norwegian motion verbs. Human sorting behavior rests on the features people use when they compare two or more words. We investigate what these features are and how discriminative each feature is in sorting. The key rationale for our method of analysis is the assumption that a sorting task rests on a similarity assessment process. The main idea is that a set of features underlies this similarity judgment, and that the similarity between two verbs amounts to the weighted sum of their similarities on each feature in the set. The computational methodology used to investigate the features is as follows. Based on the frequency of co-occurrence of verbs in the human-generated clusters, weights for a given set of features are computed using linear regression. The weights are then used to compute a similarity matrix between the verbs, which serves as input to agglomerative hierarchical clustering. If the selected (or projected) set of features aligns with the features the participants used when sorting verbs into groups, the clusters obtained with this computational method should align with the clusters generated by humans. Otherwise, the method modifies the feature set and repeats the process. Features that promote clusters aligned with the human-generated clusters are evaluated by a set of human experts, and the results show that the method manages to identify the appropriate feature sets. The method can be applied to a variety of data, ranging from experimental free production data to linguistic data from controlled experiments, in the assessment of semantic relations and hierarchies within and across languages.
Key words: Verb features, verb sorting, similarity.
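The pipeline outlined in the abstract (regression-based feature weights, a weighted similarity matrix, agglomerative clustering, and a check against the human grouping) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The English verb glosses, the three binary features, the co-occurrence frequencies, and the human grouping are all invented for the example. The choices of average linkage, a regression without intercept, and a 0/1 per-feature match are likewise assumptions, since the abstract does not specify them.

```python
from itertools import combinations

# Hypothetical binary features (upright, fast, airborne) for five motion verbs.
# Both the feature set and the values are illustrative assumptions.
FEATURES = {
    "walk":   (1, 0, 0),
    "run":    (1, 1, 1),
    "sprint": (1, 1, 1),
    "jump":   (1, 0, 1),
    "crawl":  (0, 0, 0),
}
VERBS = list(FEATURES)
PAIRS = list(combinations(range(len(VERBS)), 2))

# Invented co-occurrence frequencies: share of participants who put each
# pair of verbs in the same group, listed in the order of PAIRS.
COOCCURRENCE = [0.55, 0.45, 0.80, 0.45, 0.95, 0.70, 0.05, 0.65, 0.05, 0.30]


def feature_matches(i, j):
    """Per-feature similarity of two verbs: 1.0 if the values agree, else 0.0."""
    a, b = FEATURES[VERBS[i]], FEATURES[VERBS[j]]
    return [1.0 if x == y else 0.0 for x, y in zip(a, b)]


def solve(A, b):
    """Solve the square system A w = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]


def fit_weights(X, y):
    """Least-squares feature weights via the normal equations (X^T X) w = X^T y."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * t for row, t in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)


def similarity_matrix(weights):
    """Similarity of two verbs as the weighted sum of per-feature similarities."""
    n = len(VERBS)
    return [[sum(w * m for w, m in zip(weights, feature_matches(i, j)))
             for j in range(n)] for i in range(n)]


def agglomerate(S, n_clusters):
    """Average-linkage agglomerative clustering driven directly by similarity."""
    clusters = [[i] for i in range(len(S))]

    def avg_sim(c1, c2):
        return sum(S[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the highest average pairwise similarity.
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_sim(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return [sorted(VERBS[i] for i in c) for c in clusters]


X = [feature_matches(i, j) for i, j in PAIRS]
weights = fit_weights(X, COOCCURRENCE)
S = similarity_matrix(weights)
clusters = agglomerate(S, n_clusters=3)

# Hypothetical human-generated grouping used to check alignment.
HUMAN = [["walk", "jump"], ["run", "sprint"], ["crawl"]]
aligned = {frozenset(c) for c in clusters} == {frozenset(c) for c in HUMAN}
print(weights, clusters, aligned)
```

If `aligned` came out false, the procedure described in the abstract would revise the feature set (here, the columns of `FEATURES`) and rerun the same steps. Clustering on similarity directly avoids having to choose a similarity-to-distance conversion, which the abstract leaves open.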