SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737
Print version ISSN 1405-5546

Abstract

RADHAKRISHNAN, Priya; JAWAHAR, Ganesh; GUPTA, Manish  and  VARMA, Vasudeva. SNEIT: Salient Named Entity Identification in Tweets. Comp. y Sist. [online]. 2017, vol.21, n.4, pp.665-679. ISSN 2007-9737.  https://doi.org/10.13053/cys-21-4-2864.

Social media is a rich source of information and opinion, with an exponential data growth rate. However, social media posts are difficult to analyze because they are brief, unstructured, and noisy. Interestingly, many social media posts are about one or more entities. Understanding which entity is central to a post (the Salient Entity) helps in analyzing the post. In this paper, we propose a model that aids such analysis by identifying the Salient Entity in a social media post, tweets in particular. We present a supervised machine-learning model that identifies the Salient Entity in a tweet, and we propose that the tweet is most likely about that entity. To build a dataset of tweets and salient entities, we rely on the premise that when an image accompanies a text, the text is most likely about the entity in that image. We trained our model on this dataset. Note that this does not restrict the applicability of our model: we use tweets with images only to obtain objective ground-truth data, while the features for the model are derived from the tweet text. Our experiments show that the model identifies the Salient Named Entity with an F-measure of 0.63. We show the effectiveness of the proposed model for tweet-filtering and salience identification tasks. We have made the human-annotated dataset and the source code of this model publicly available.

Keywords: Entity salience; named entity recognition; semantic search; named entity extraction.
