SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

BRAMBILA-HERNANDEZ, José Alfredo et al. Novel Dynamic Decomposition-Based Multi-Objective Evolutionary Algorithm Using Reinforcement Learning Adaptive Operator Selection (DMOEA/D-SL). Comp. y Sist. [online]. 2024, vol.28, n.2, pp.739-749.  Epub 31-Oct-2024. ISSN 2007-9737.  https://doi.org/10.13053/cys-28-2-5018.

In static multi-objective optimization, many works address the adaptive selection of genetic operators, including multi-armed bandit-based and probability-based methods. For dynamic multi-objective optimization, such work is hard to find. Dynamic multi-objective problems are characterized by objective functions and constraints that change over time. Adaptive operator selection chooses the most suitable variation operator at each point in the run of a multi-objective evolutionary algorithm. This work incorporates a new adaptive operator selection method into a Dynamic Multi-objective Evolutionary Algorithm Based on Decomposition; we call the resulting algorithm DMOEA/D-SL. The method is based on the reinforcement learning algorithm State-Action-Reward-State-Action Lambda, or SARSA(λ). SARSA(λ) trains an agent in an environment to make sequential decisions and learn to maximize an accumulated reward over time; in this case, the decision is which operator to apply at a given moment. Eight dynamic multi-objective benchmark problems were used as test instances to evaluate the algorithm's performance. Each problem produces five Pareto fronts. Three metrics were used: Inverted Generational Distance, Generalized Spread, and Hypervolume. The non-parametric Wilcoxon test was applied at a 5% significance level to validate the results.
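To illustrate how SARSA(λ) can drive operator selection, the following is a minimal tabular sketch, not the authors' implementation. The abstract does not specify the state representation, the reward signal, the operator pool, or the parameter values, so everything here is an assumption: a single dummy state, three placeholder operators with fixed hypothetical rewards, epsilon-greedy exploration, accumulating eligibility traces, and the class and function names themselves.

```python
import random


class SarsaLambdaSelector:
    """Tabular SARSA(lambda) agent whose actions are operator indices.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """

    def __init__(self, n_states, n_ops, alpha=0.1, gamma=0.5, lam=0.5,
                 epsilon=0.1, seed=0):
        self.n_states = n_states
        self.n_ops = n_ops
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.lam = lam          # trace decay (the "lambda")
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_ops for _ in range(n_states)]  # action values
        self.e = [[0.0] * n_ops for _ in range(n_states)]  # eligibility traces

    def select(self, state):
        """Epsilon-greedy choice of an operator index for the given state."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_ops)
        row = self.q[state]
        best = max(row)
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def update(self, s, a, reward, s_next, a_next):
        """One SARSA(lambda) step with accumulating traces."""
        delta = reward + self.gamma * self.q[s_next][a_next] - self.q[s][a]
        self.e[s][a] += 1.0
        for i in range(self.n_states):
            for j in range(self.n_ops):
                self.q[i][j] += self.alpha * delta * self.e[i][j]
                self.e[i][j] *= self.gamma * self.lam

    def best(self, state):
        """Index of the operator with the highest learned value."""
        row = self.q[state]
        return row.index(max(row))


def run_demo(steps=3000):
    # Hypothetical fixed reward per operator, standing in for a measure of
    # offspring improvement; a real algorithm would compute this online.
    rewards = [0.1, 0.3, 1.0]
    selector = SarsaLambdaSelector(n_states=1, n_ops=len(rewards))
    s = 0
    a = selector.select(s)
    for _ in range(steps):
        r = rewards[a]                  # apply operator a, observe reward
        s_next = 0                      # single dummy state (assumption)
        a_next = selector.select(s_next)
        selector.update(s, a, r, s_next, a_next)
        s, a = s_next, a_next
    return selector


if __name__ == "__main__":
    print("preferred operator index:", run_demo().best(0))
```

In a dynamic setting, one plausible design choice would be to reset the traces, and possibly the action values, whenever a change in the objectives is detected; whether DMOEA/D-SL does this is not stated in the abstract.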

Keywords: Adaptive; operator; selection; dynamic; multi-objective; optimization.
