SciELO - Scientific Electronic Library Online

 



Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

ASNANI, Kavita and PAWAR, Jyoti D. Extraction of Code-mixed Aspect Topics in Semantic Representation. Comp. y Sist. [online]. 2018, vol.22, n.1, pp.55-63. ISSN 2007-9737. https://doi.org/10.13053/cys-22-1-2771.

With the recent growth and popularity of social networking forums, millions of people connected to the World Wide Web commonly communicate in multiple languages. This has produced large volumes of unstructured code-mixed social media text in which useful aspects of information are highly dispersed. Aspect-based opinion mining relates opinion targets to their polarity values in a specific context. Because aspects are often implicit, detecting and retrieving them is difficult, and the linguistic complexities of code-mixed social media text make the task harder still. Topic modeling is a standard approach with the potential to extract aspects pertaining to opinion data from large text collections; it not only retrieves implicit aspects but also clusters them together. In this paper we propose a knowledge-based, language-independent code-mixed semantic LDA (lcms-LDA) model, with the aim of improving the coherence of clusters. We find that the proposed lcms-LDA model infers topic distributions without a language barrier, based on the semantics associated with words. Our experimental results show an increase in the UMass and KL divergence scores, indicating improved coherence and distinctiveness of the aspect clusters in comparison with state-of-the-art techniques used for aspect extraction from code-mixed data.
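The abstract names the two evaluation scores but does not define them. As a rough illustration only, the sketch below uses the commonly cited UMass coherence definition (a sum of log ratios of co-document frequency to document frequency over ordered pairs of a topic's top words) and the standard KL divergence between two topic-word distributions. The function names, the smoothing constant, and the toy code-mixed documents are assumptions made for this example; they are not taken from the paper, and the authors' exact formulation may differ.

```python
import math


def umass_coherence(top_words, documents, eps=1.0):
    """UMass-style coherence for one topic (higher, i.e. closer to 0, is better).

    top_words: the topic's words, ordered from most to least probable.
    documents: tokenized documents, each a list of words.
    eps: smoothing count added to the co-document frequency (assumed value).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # word never occurs in the corpus; skip the pair
            co = doc_freq(top_words[m], top_words[l])
            score += math.log((co + eps) / d_l)
    return score


def kl_divergence(p, q, smoothing=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + smoothing) / (qi + smoothing))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical toy corpus of Hindi-English code-mixed review snippets.
docs = [
    ["camera", "achha", "battery"],
    ["camera", "battery"],
    ["screen", "kharab"],
]
coherence = umass_coherence(["camera", "battery"], docs)
divergence = kl_divergence([1.0, 0.0], [0.5, 0.5])
```

In this reading, a larger UMass score suggests that a topic's top words tend to co-occur in the same documents, while a larger KL divergence between topics suggests that the aspect clusters are more distinct from one another.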

Keywords: Code-mixed aspect extraction; knowledge-based topic modeling; semantic clustering; BabelNet; language-independent semantic word association.
