SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

LOPEZ-MEDINA, Karen Pamela; URIARTE-ARCIA, Abril Valeria and YANEZ-MARQUEZ, Cornelio. Data Stream Classification based on an Associative Classifier. Comp. y Sist. [online]. 2024, vol.28, n.2, pp.387-400. Epub 31-Oct-2024. ISSN 2007-9737. https://doi.org/10.13053/cys-28-2-4737.

Currently, the diversity of sources generating data in a massive online manner causes data streams to become part of many real-world applications. Learning from a data stream is a very challenging task due to the non-stationary nature of this type of data. Characteristics such as infinite length, concept drift, concept evolution and recurrent concepts are the most common problems that need to be addressed by data stream learning algorithms. In this work, an algorithm for data stream classification based on an associative classifier is presented. This proposal combines a clustering algorithm and the Naïve Associative Classifier for Online Data (NACOD) to address this problem. A set of micro-clusters (MCs), a data structure that summarizes the information of the current data, is used instead of storing the whole data. The set of MCs is continually updated with the arriving data, either by creating new MCs or by updating existing ones. The added MCs help to deal with concept drift. To assess the performance of the proposed model, experiments were carried out on three data sets commonly used to evaluate data stream classification algorithms: KDD Cup 1999, Forest Cover Type and Statlog (Shuttle). Our model achieved higher accuracies than algorithms such as the data stream versions of Naïve Bayes and Hoeffding Tree. The average accuracies achieved were 100 % for KDD Cup 1999, 99.01 % for Statlog (Shuttle) and 70.44 % for Forest Cover Type.

Keywords: Data stream classification; associative classifier; concept-drift.
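
The abstract describes micro-clusters as summaries that are either updated or newly created as data arrive, without giving their exact contents. The sketch below illustrates that general idea only. It assumes a generic cluster-feature summary (count, per-feature linear sum and squared sum) and a fixed distance threshold; the names MicroCluster, update and max_dist are hypothetical and are not taken from the paper, and the NACOD classification step is not shown.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Hypothetical summary of nearby instances sharing one class label."""
    label: int
    n: int
    linear_sum: list   # per-feature sum of absorbed values
    squared_sum: list  # per-feature sum of squared absorbed values

    @classmethod
    def from_point(cls, x, label):
        return cls(label, 1, list(x), [v * v for v in x])

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        # Root of the summed per-feature variance around the centroid.
        var = sum(sq / self.n - (ls / self.n) ** 2
                  for ls, sq in zip(self.linear_sum, self.squared_sum))
        return math.sqrt(max(var, 0.0))

    def absorb(self, x):
        # Update the summary in place; the instance itself is not stored.
        self.n += 1
        for i, v in enumerate(x):
            self.linear_sum[i] += v
            self.squared_sum[i] += v * v


def update(mcs, x, label, max_dist=1.0):
    """Absorb x into the nearest same-label MC, or create a new MC.

    max_dist is an assumed threshold, not a parameter from the paper.
    """
    best, best_d = None, float("inf")
    for mc in mcs:
        if mc.label != label:
            continue
        d = math.dist(mc.centroid(), x)
        if d < best_d:
            best, best_d = mc, d
    if best is not None and best_d <= max_dist:
        best.absorb(x)
        return best
    new_mc = MicroCluster.from_point(x, label)
    mcs.append(new_mc)
    return new_mc


mcs = []
update(mcs, [0.0, 0.0], label=0)  # first instance: creates an MC
update(mcs, [0.2, 0.0], label=0)  # close enough: updates that MC
update(mcs, [5.0, 5.0], label=0)  # far away: creates a second MC
update(mcs, [0.1, 0.0], label=1)  # different label: creates a third MC
```

Keeping only sums rather than raw instances is what lets such a summary stay bounded in size on a stream of unbounded length; creating a new summary when no existing one is close is one simple way a model can follow a shifting distribution.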
