SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737
Print version ISSN 1405-5546

Abstract

GOULIEV, Zaur and PEREZ-TELLEZ, Fernando. Topic Modelling and Sentiment Analysis via News Headlines, NLP Methods on Australian Broadcasting Commission. Comp. y Sist. [online]. 2024, vol.28, n.4, pp.2103-2115. Epub 25-Mar-2025. ISSN 2007-9737. https://doi.org/10.13053/cys-28-4-5305.

The main aim of this paper is to provide an overview, implementation, and comparison of some of the main supervised and unsupervised machine learning methods used in natural language processing to extract topics and sentiment from headlines. The paper employs supervised learning methods such as logistic regression and a support vector machine (SVM) classifier, and unsupervised learning methods such as K-means clustering and latent Dirichlet allocation (LDA). To demonstrate these NLP applications, we use a dataset of one million news headlines, published online by the Australian Broadcasting Commission and spanning 17 years, which allows for rich analysis. Our results show that logistic regression models that use lexicon-based emotion classifiers achieve high accuracy for sentiment analysis, reaching 93%, while the clustering-based technique K-means scored 75% for topic modelling. A detailed explanation of these methods is given, along with their limitations, assumptions, ethical considerations, and suggestions for future work.

Keywords: News headlines; machine learning; natural language processing; sentiment analysis.
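
As a rough illustration of the sentiment pipeline the abstract describes (lexicon-based labels feeding a logistic regression classifier), the following minimal sketch may help. It is not the authors' implementation: the lexicon, the toy headlines, the bag-of-words features, and every function name here are hypothetical, chosen only to keep the example self-contained.

```python
import math

# Hypothetical toy lexicon and headlines, for illustration only (not from the paper).
POSITIVE = {"wins", "record", "boost", "rescued"}
NEGATIVE = {"crash", "dies", "fire", "fears"}

HEADLINES = [
    "local team wins grand final",
    "farmers welcome record harvest",
    "tourism boost for coastal towns",
    "hikers rescued after storm",
    "two injured in highway crash",
    "man dies in house fire",
    "bushfire fears grow in the north",
    "factory fire forces evacuation",
]

def lexicon_label(headline):
    """Return 1 (positive) or 0 (negative) by counting lexicon hits."""
    words = headline.split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1 if score > 0 else 0

def build_vocab(texts):
    """Map each distinct word to a column index."""
    return {w: i for i, w in enumerate(sorted({w for t in texts for w in t.split()}))}

def vectorize(text, vocab):
    """Bag-of-words counts; words outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for w in text.split():
        if w in vocab:
            vec[vocab[w]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(text, vocab, w, b):
    """Classify a headline as 1 (positive) or 0 (negative)."""
    x = vectorize(text, vocab)
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Step 1: the lexicon supplies the training labels.
labels = [lexicon_label(h) for h in HEADLINES]
# Step 2: the classifier learns to reproduce them from word counts.
vocab = build_vocab(HEADLINES)
X = [vectorize(h, vocab) for h in HEADLINES]
weights, bias = train_logreg(X, labels)
```

Calling `predict("some new headline", vocab, weights, bias)` then classifies unseen text using the learned word weights. Note that accuracy measured against lexicon-derived labels reflects agreement with the lexicon rather than with human judgement, which is worth keeping in mind when reading a figure such as the 93% reported above.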
