SciELO - Scientific Electronic Library Online

 
vol.28 número4Exploring Personality Traits: A Big Five Profiling of Fictional Movie CharactersSelection of Content Measures for Evaluation of Text Summaries Using Genetic Algorithms índice de autoresíndice de materiabúsqueda de artículos
Home Pagelista alfabética de revistas  

Servicios Personalizados

Revista

Articulo

Indicadores

Links relacionados

  • No hay artículos similaresSimilares en SciELO

Compartir


Computación y Sistemas

versión On-line ISSN 2007-9737versión impresa ISSN 1405-5546

Resumen

ALCANTARA, Tania et al. LyricScraper: A Dataset of Spanish Song Lyrics Created via Web Scraping and Dual-labeling for LLM Classification. Comp. y Sist. [online]. 2024, vol.28, n.4, pp.2251-2260.  Epub 25-Mar-2025. ISSN 2007-9737.  https://doi.org/10.13053/cys-28-4-5292.

Songs represent a powerful means of expressing emotions through melody and lyrics. This study focuses on understanding and classifying emotions present in songs, ranging from positive and negative to neutral emotions. This classification and understanding would not be possible without data, which was gathered using a proprietary web scraping algorithm to collect lyrics data online. Subsequently, a pseudo-labeling approach based on BERT was employed to assign sentiment labels to these lyrics, leveraging BERT’s ability to comprehend context and semantic relationships in language. This process enhanced the dataset’s quality and contributed to the success of sentiment analysis in songs. The new dataset addressed challenges related to sentence length by providing examples of song lyrics of varying lengths, facilitating more effective model training. Additionally, data imbalance was addressed through careful sample selection, representing a wide range of emotions in songs. This new dataset underwent classification using large-scale language models, achieving promising results. The accuracy metric reached an impressive 97.66% for DistilBERT and 97.83% for the F1 metric, highlighting the effectiveness of this approach in song sentiment analysis. This study underscores the importance of understanding emotions in songs and offers practical solutions to enhance the capabilities of language models in this task.

Palabras llave : NLP; pseudo labeling; web scraping; deep learning.

        · texto en Inglés     · Inglés ( pdf )