SciELO - Scientific Electronic Library Online

 
Polibits

On-line version ISSN 1870-9044

Abstract

JIMENEZ, Sergio; GONZALEZ, Fabio A.  and  GELBUKH, Alexander. Soft Cardinality in Semantic Text Processing: Experience of the SemEval International Competitions. Polibits [online]. 2015, n.51, pp.63-72. ISSN 1870-9044.  http://dx.doi.org/10.17562/PB-51-9.

Soft cardinality is a generalization of the classic set cardinality (i.e., the number of elements in a set), which exploits similarities between elements to provide a "soft" count of the number of elements in a collection. The model is general enough to be used interchangeably as the cardinality function in resemblance coefficients such as Jaccard's, Dice's, cosine, and others. Beyond that, cardinality-based features can be extracted from pairs of objects being compared in order to learn adaptive similarity functions from training data. This approach can be used to compare any objects that can be represented as sets or bags. We and other international teams used soft cardinality to address a series of natural language processing (NLP) tasks in the SemEval (semantic evaluation) competitions from 2012 to 2014. The systems based on soft cardinality consistently ranked among the best systems in every task in which they participated. This paper describes our experience in that journey by presenting the generalities of the model and some practical techniques for applying soft cardinality to NLP problems.
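
The abstract does not state the formula, so the following is only a minimal sketch under stated assumptions. It uses the form in which soft cardinality is commonly written, where each element contributes 1 / Σ_b sim(a, b)^p, so an element counts for less the more similar it is to the other elements. The character-bigram Dice similarity `word_sim`, the softness exponent `p`, and all function names are illustrative choices, not the authors' implementation. The sketch also shows how the soft count can be substituted into Jaccard's coefficient by deriving the intersection as |A| + |B| - |A ∪ B|.

```python
def bigrams(word):
    """Character bigrams of a word, padded with '#' at both ends."""
    padded = f"#{word}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def word_sim(a, b):
    """Illustrative similarity in [0, 1]: Dice coefficient over character bigrams."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def soft_cardinality(elements, sim=word_sim, p=1.0):
    """Soft count: each element adds 1 / (sum of its similarities to all elements).

    With a crisp similarity (1 if equal, else 0) this reduces to the
    classic cardinality of the set.
    """
    items = list(set(elements))
    return sum(1.0 / sum(sim(a, b) ** p for b in items) for a in items)


def soft_jaccard(a, b, sim=word_sim, p=1.0):
    """Jaccard coefficient with soft cardinality in place of the classic count."""
    card_a = soft_cardinality(a, sim, p)
    card_b = soft_cardinality(b, sim, p)
    card_union = soft_cardinality(set(a) | set(b), sim, p)
    # Intersection is inferred from the union, since only a counting
    # function is available, not an explicit soft intersection.
    card_inter = card_a + card_b - card_union
    return card_inter / card_union if card_union else 0.0


if __name__ == "__main__":
    print(soft_cardinality({"cat", "cats"}))  # between 1 and 2
    print(soft_jaccard({"the", "cat", "sleeps"}, {"the", "cats", "sleep"}))
```

Because "cat" and "cats" share most of their bigrams, the pair counts as somewhat more than one element but less than two, which is the "soft" counting behaviour described above. Any other similarity function over the elements could be passed in through the `sim` parameter.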

Keywords: Similarity measure; soft computing; set cardinality; semantics; natural language processing.


 

All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.