The Gradient Subspace Approximation as Local Search Engine within Evolutionary Multi-objective Optimization Algorithms

Alvarado, Sergio; Segura, Carlos; Schütze, Oliver; Zapotecas, Saúl; Alvarado, Sergio; Segura, Carlos; Schütze, Oliver; Zapotecas, Saúl

doi:10.13053/cys-22-2-2948

Services on Demand

Journal

Article

Indicators

Cited by SciELO
Access statistics

Computación y Sistemas

On-line version ISSN 2007-9737Print version ISSN 1405-5546

Comp. y Sist. vol.22 n.2 Ciudad de México Apr./Jun. 2018 Epub Jan 21, 2021

https://doi.org/10.13053/cys-22-2-2948

Articles of the Thematic Issue

The Gradient Subspace Approximation as Local Search Engine within Evolutionary Multi-objective Optimization Algorithms

Sergio Alvarado¹

Carlos Segura²

Oliver Schütze¹

Saúl Zapotecas³

^¹ CINVESTAV-IPN, Computer Science Department, Mexico.

^² Center for Research in Mathematics (CIMAT), Guanajuato, Mexico.

^³ Universidad Autónoma Metropolitana, Cuajimalpa Unit, Department of Applied Mathematics and Systems, Division of Natural Sciences and Engineering, Mexico.

Abstract:

In this paper, we argue that the gradient subspace approximation (GSA) is a powerful local search tool within memetic algorithms for the treatment of multi-objective optimization problems. The GSA utilizes the neighborhood information within the current population in order to compute the best approximation of the gradient at a given candidate solution. The computation of the search direction comes hence for free in terms of additional function evaluations within population based search algorithms such as evolutionary algorithms. Its benefits have recently been discussed in the context of scalar optimization. Here, we discuss and adapt the GSA for the case that multiple objectives have to be considered concurrently. We will further on hybridize line searchers that utilize GSA to obtain the search direction with two different multi-objective evolutionary algorithms. Numerical results on selected benchmark problems indicate the strength of the GSA-based local search within the evolutionary strategies.

Keywords: Multi-objective optimization; evolutionary computation; gradient subspace approximation (GSA); memetic algorithms; gradient-free local search; line search method

1 Introduction

In many real-world applications it is necessary to solve optimization problems where several different objectives have to be considered at the same time. As an example, consider the design of a given product where two important objectives are certainly to maximize its performance while minimizing its cost. Such problems are known in literature as multi-objective optimization problems (MOPs). Solving MOPs is in general not an easy task. One important characteristic of MOPs is that their solution sets, the so-called Pareto sets, form objects of dimension k−1, where k is the number of objectives involved in the problem.

This is in contrast to ”classical” scalar optimization problems (SOPs), where one typically obtains one single optimal solution (i.e., a zero dimensional set). Since the Pareto sets can apart from simple academic problems not be expressed analytically, one important task in multi-objective optimization is thus to find a suitable finite size representation of the solution set in order to provide the decision maker an overview of his/her possibilities.

Over the last two decades, specialized evolutionary algorithms, called multi-objective evolutionary algorithms (MOEAs), have caught the interest of many researchers for the treatment of MOPs. Reasons for this include that MOEAs are applicable to a wide range of problems, are of global nature and hence in principle not too sensitive to the initial candidate set, and allow to compute a finite size representation of the Pareto set in one single run of the algorithm (see, e.g., [⁷, ¹, ⁴², ⁵] and references therein).

However, one common drawback of all MOEAs is that they need quite a few function evaluations in order to obtain acceptable Pareto set approximations due to their relatively slow convergence rates. As a possible remedy, researchers have considered memetic algorithms, i.e., MOEAs hybridized with local search strategies that are mainly coming from mathematical programming techniques (e.g. [³⁷, ²¹, ⁴⁰, ¹⁵, ²⁵, ³, ², ¹³, ³⁸]).

However, though memetic strategies increase in general the obtained approximation qualities which is caused by the local convergence of mathematical programming techniques, their overall cost is quite high due to the requirement of the gradient (and maybe even the Hessian), of the objectives. Note that not in all applications gradient information is at hand.

In that case, one can approximate this information e.g. via finite difference methods (e.g., [²⁸]). However, the cost for the approximation of the gradient is linearly proportional to the dimension n of the decision space and even quadratic in case Hessian information is computed.

Even for other gradient free methods so-called exploration movements are required to gather the required information from the neighborhood to find a suitable search direction [¹⁸]. Such movements, that ideally generate the steepest descent direction, also come with in principle n additional function calls making the local search a quite costly procedure.

The Gradient Subspace Approximation (GSA), is a numerical technique that aims to compute the most greedy search direction at a given point x for an objective f out of a given subspace S ([³³]). For unconstrained optimization problems this leads ideally to the normalized negative of the gradient ∇f(x).

One advantage of the GSA is that both inequality and equality constraints can be directly incorporated in the computation of the search direction. Another important aspect is that gradient information is not needed as the given neighborhood information can be exploited. Thus, the computation of the search direction comes in principle for free (in terms of additional function evaluations), within population based search techniques such as evolutionary algorithms. GSA has been investigated within memetic algorithms for the treatment of SOPs in [³³].

In this paper, we apply GSA to the context of multi-objective optimization, i.e., we make a first effort to use and adapt the gradient estimator within the local search engines of memetic MOEAs. As it can be seen, GSA can in principle be coupled with any gradient based local search technique making it a universal and inexpensive tool for multi-objective memetic algorithms.

The remainder of this paper is organized as follows. Section 2, presents some basic concepts required for the understanding of this work. Section 3 presents some motivations and adaptations of GSA to the context of multi-objective optimization. In Section 4, we present two different memetic algorithms that use GSA for the local search engine. Section 5 presents numerical results on some selected benchmark functions on the two memetic algorithms including a comparison to their base algorithms. Finally, we will conclude and give possible paths for future work in Section 6.

2 Basic Concepts

2.1 Multi-Objective Optimization

A continuous MOP can be stated in mathematical terms as:

min⁡x∈ℝnF(x),s.t. gi(x)≤0 i=1,…,p,hj(x)=0 j=1,…,m, (1)

where F is defined as the vector of the objective functions:

F:ℝn→ℝk,F(x)=(f1(x),…,fk(x))T, (2)

and where each objective fi:ℝn→ℝ, i=1,…,k, is for convenience considered to be continuously differentiable. We stress, however, that this smoothness assumption is only needed for the theoretical considerations but not for the realization of the algorithms. The inequality constraints gi, i=1,…,p, and the equality constraints hj, j=1,…,m, form the feasible set Q, i.e.:

Q={x∈ℝn:gi(x)≤0,i=1,…,p, and hj(x)=0,j=1,…,m}. (3)

When defining the optimal solution we cannot proceed as for the classical scalar optimization case since F is vector-valued. For MOPs, optimality is defined based on the concept of dominance [²⁹]:

Definition 1 (a) Letx,y∈Qandfx=f(x),fy=f(y)∈ℝkbe its respective images. We say that the vectorxdominatesy(denoted asx≺y) ifffix≤fiy, i=1,…,k,, and there exist an indexjsuch thatfjx<fjy.

(b) A vectorx*∈Qis called Pareto optimal if there does not exist another pointy∈Qsuch thaty≺x*.
(c) The set of Pareto optimal solutions is called the Pareto set (PS), i.e.,

PS={x∈Q|∃y∈Q,y≺x}. (4)

(d) Finally, the image of the Pareto set,F(PS), is called the Pareto front (PF).

Typically, i.e., under certain mild assumptions on the MOP, both the Pareto set and the Pareto front form (k−1)-dimensional sets [¹⁴].

2.2 Solving a MOP

So far, many different classes of methods for the numerical solution of MOPs exist such as mathematical programming techniques [³⁷, ⁶, ²⁶], continuation methods [³¹, ¹⁴, ³⁴, ³², ³⁰, ²³, ²⁴, ²²], or cell-mapping and subdivision techniques [⁹, ¹⁶, ¹², ²⁷, ¹¹].

In this study we are particularly interested in multi-objective evolutionary algorithms [⁵, ⁷] and memetic algorithms [³⁷, ²¹, ⁴⁰, ¹⁵], as they have demonstrated to be efficient in many cases.

In the following we shortly present two different MOEAs that we use within this work as the base algorithms for our GSA-based local search: the decomposition based algorithm MOEA/D [⁴²] and a memetic variant of the well-known algorithm NGSA-II that utilizes gradient information [¹⁹].

2.2.1 MOEA/D

The Multi-objective Evolutionary Algorithm based on Decomposition (MOEA/D, [⁴²]), is a state-of-the-art evolutionary algorithm that decomposes the given MOP into several auxiliary SOPs. The main idea behind MOEA/D is to solve each of these SOPs simultaneously. At the end of the run of MOEA/D, it constructs an entire approximation of the Pareto front by combining the results obtained from each of the auxiliary SOPs.

There exist several scalarization methods coupled along with the MOEA/D. In this paper we are concerned in particular with the Tchebycheff decomposition [²⁶]. This decomposition is based on the infinity norm. The Tchebycheff decomposition requires a reference point z∈ℝk and a weight w∈ℝk for each subproblem. The Tchebycheff decomposition can be defined as:

T(fxw,z)=max⁡1=1,…,kwi|fix−zi|, (5)

where fx represents the image of x. MOEA/D establishes a way of udpating the reference point, meaning that no additional parameters are required.

Decomposition based methods have many advantages over ”classical” elitist MOEAs that are entirely based on the dominance relation such as the potential for a faster convergence toward the Pareto set [⁴, ³⁶, ³⁸]. In our context, GSA can be applied to all the auxiliary SOPs as done in [³³].

In addition, there is another (yet unexplored), aspect of MOEA/D that makes it an interesting candidate for GSA: it already has implemented a certain neighborhood structure to decide which SOP is ”near” to another one. Thus, one may use this structure to select the neighboring samples xi for a given candidate solution x0 which we, however, have to leave for future work.

The Algorithm 1, presents the pseudocode of MOEA/D using Tchebycheff decomposition. In the algorithm, N represents the number of subproblems and sN represents the size of the neighborhood.

Algorithm 1 Pseudocode of the MOEA/D

2.2.2 IG-NSGA-II

The improved gradient based NSGA-II (IG-NSGA-II) [¹⁹] is a modification of the work proposed in [⁴¹]. The IG-NSGA-II algorithm uses a local search strategy based on the gradient information of the objective functions to compute a descend direction for unconstrained MOPs. In particular, the descend direction for k=2 is constructed according to the work proposed by Lara et al. in [²⁰]:

vL(x)=−(g1(x)‖g1(x)‖2+g2(x)‖g2(x)‖2), (6)

where g1=∇f1(x) and g2=∇f2(x).

The IG-NSGA-II implements two different mechanisms to preserve the diversity of the final solution set. Both mechanisms are based on a chaotic map that are used to generate new candidate solutions. The first mechanism is a new method to initialize the population of the algorithm. The second mechanism computes the spreading λs of the entire population at each generation. If this spreading value is below a certain threshold a new individual is created using a chaotic map. The new individual created by the chaotic approach is included in the current population as a mechanism to explore different regions.

Algorithm 2, presents the pseudocode for the IG-NSGA-II. Here, N again denotes the population size.

Algorithm 2 Pseudocode of the IG-NSGA-II

2.3 Gradient Subspace Approximation

The Gradient Subspace Approximation (GSA) [³³], is a method proposed to approximate the gradient of a function at a given candidate solution. GSA utilizes the information that is present for each generation of an evolutionary algorithm to compute such an approximation. Here, we present a brief description for the unconstrained case. For the treatment of constrained problems (inequality and equality constraints), the reader is referred to [³³].

Consider the following single objective optimization problem:

min⁡x∈ℝnf(x), (7)

where f:ℝn→ℝ is assumed to be differentiable. We say that v∈ℝn is a descend direction of f at x if:

〈v,∇f(x)〉<0, (8)

where ∇f(x) represents the gradient of f at x. It is known (e.g., [²⁸]) that the vector with the maximal decay at x is given by the negative of the gradient. This (normalized) vector can be expressed also as the solution of the following auxiliary SOP:

min⁡v∈ℝn〈v,∇f(x0)〉,s.t. ‖v‖=1. (9)

To exploit the neighboring information of the candidate solution x0, consider that r search directions vr are given along with the r directional derivatives of such search directions 〈vi,∇f(x0)〉,i=1,…,r. If we incorporate the neighborhood information such that the most greedy direction exist in v∈span(v1,…,vr), Problem (9) can be rewritten as follows:

min⁡λ∈ℝr(〈∇f(x0),∑i=1rλivi〉=∑i=1rλi〈∇f(x0),vi〉),s.t. ‖∑i=1rλivi‖22−1=λTVTVλ−1=0, (10)

where

V=(v1,…,vr)∈ℝn×r. (11)

The following result shows that the solution of (10) is unique under certain assumptions and that it can be expressed analytically.

Proposition 1 Let the search directionsv1,…,vr∈ℝn, r≤n, be linearly independent and:

λ¯*:=−(VTV)−1VT∇f(x0). (12)

Then, there exist a single solution ofEquation (10)given by:

λ*:=λ¯*‖Vλ*‖22, (13)

and thus,

v*=−1‖Vλ*‖22V(VTV)−1VT∇f(x0), (14)

is the most greedy search direction inspan{vi,…,vr}.

Equation (14), requires that gradient information is given. To avoid such restriction we can approximate such information using the neighbors of the candidate solution. Given the candidate solution x0, a set of neighbors xi, i=1,…,r, and their respective function values f(xi), i=1,…,r, the gradient information can be approximated such that:

〈v,∇f(x0)〉i=f(xi)−f(x0)‖xi−x0‖2+O(‖xi−x0‖). (15)

In the above equation, the search direction has apparently been chosen as:

vi:=xi−x0‖xi−x0‖2, i=1,…,r. (16)

For more details and for the extension to equality and inequality constrained problems we refer to [³³].

3 Proposed Approach

In this section, we will adapt the local search tool GSA from [³³] to the context of multi-objective optimization problems, in particular for the use within specialized evolutionary algorithms.

To this end, we will (a) investigate the applicability of GSA within MOEAs which includes a way to compute the neighborhood for the chosen samples, (b) discuss how to approximate the Jacobian matrix via GSA, (c) extend Laras descent direction for inequality constrained problems in order to increase its applicability, (d) discuss a particular choice of the step size control, and (e) discuss how to efficiently balance the resources for the GSA based local search.

Finally, we will (f) propose two different memetic algorithms, MOEA/D/GSA and IG-NSGA-II/GSA, that utilize GSA for their local search engines.

3.0.1 Applicability of GSA within MOEAs

In population based scalar optimization one typically observes that many individuals are located around the best found individual (denoted here by x0), apart from very early stages of the search. Thus, for the usage of GSA one can expect that sufficient samples are at hand near x0 to approximate ∇f(x0). The situation, however, changes when considering multi-objective optimization problems as for such problems there does typically not exist one single solution, but an entire manifold of solutions. Hence, there is no single ”best” solution any more where other solutions group around. Instead, the solutions are spread along the solution set.

Thus, the natural question arises if there exist enough samples within multi-objective evolutionary algorithms (or any other set or population based algorithms for the treatment of MOPs) such that GSA can be applied successfully. To answer this question empirically, we consider in the following the neighborhood sizes from populations obtained via MOEA/D for four different MOPs with different characteristics. In all cases, we use the 2-norm in decision to define the neighborhood. More precisely, we define the neighborhood for an individual p0 of a given population P as:

Nδ(p0):={p∈P\{x0}:‖p−p0‖2≤δ}, (17)

however, we stress that in principle any other norm can be chosen.

In our computations, we have used the values δ=0.25, 0.5, and 1. As MOPs we have chosen (see Table 8 for the formulations of all problems used in this work):

(a) CONV: unimodal, n=2 decision variables, k=2 objectives, linear and connected Pareto set, convex Pareto front,
(b) ZDT1: unimodal, n=30, k=2, linear and connected Pareto set, convex Pareto front,
(c) Kursawe: multi-modal, n=3, k=2, disconnected Pareto set and front, non-convex Pareto front, and
(d) DTLZ3 (3): multi-modal, n=12, k=3, connected and linear Pareto set, concave Pareto front.

Table 1 Parameters for the MOEA/D/GSA algorithm

Parameter	Description	Value
Parameter	Description	ZDT	DTLZ
ηc	Distribution index for crossover	20
ηm	Distribution index for mutation	20
pc	Crossover probability	0.95
pm	Mutation probability	1/n
N	Number of subproblems	100	300
k	Number of objective	2	3
r	Number of neighbors for GSA	5
Gmax	Maximum participation GSA	0.2

Table 2 Δ2 indicator results on the ZDT and DTLZ test function

	MOEA/D	MOEA/D/GSA	MOEA/D/NM
ZDT1 (st. dev.)	0.28312 (0.19010)	0.20459 (0.02726)	0.54943 (0.14881)
ZDT2 (st. dev.)	1.93103 (2.55223)	0.21283 (0.02728)	5.25555 (2.54624)
ZDT3 (st. dev.)	0.19964 (0.12773)	0.16347 (0.04867)	0.27417 (0.18484)
ZDT4 (st. dev.)	0.77793 (0.50872)	0.30239 (0.28498)	1.08488 (0.60226)
ZDT6 (st. dev.)	0.27876 (0.04682)	0.61617 (0.27990)	0.32326 (0.06259)
DTLZ1 (st. dev.)	0.30190 (0.71422)	0.28239 (0.77269)	0.36506 (0.80286)
DTLZ2 (st. dev.)	0.06878 (0.00065)	0.06859 (0.00087)	0.06884 (0.00085)
DTLZ3 (st. dev.)	3.50345 (4.28843)	1.43777 (3.22033)	4.07758 (4.36678)
DTLZ4 (st. dev.)	0.44036 (0.00871)	0.43869 (0.01101)	0.44319 (0.01194)
DTLZ5 (st. dev.)	73.02189 (7.48926)	60.40117 (7.51920)	53.95727 (10.55201)
DTLZ6 (st. dev.)	73.02189 (7.48926)	60.40117 (7.51920)	53.95727 (10.55201)
DTLZ7 (st. dev.)	11.86712 (0.47831)	19.64119 (0.61246)	11.97166 (0.61289)

Table 3 Hypervolume indicator results on the ZDT and DTLZ test functions

	MOEA/D	MOEA/D/GSA	MOEA/D/NM
ZDT1 (st. dev.)	24.50301 (0.15101)	24.61340 (0.01050)	24.31119 (0.12874)
ZDT2 (st. dev.)	23.02988 (1.45125)	24.24493 (0.01949)	21.23483 (0.96625)
ZDT3 (st. dev.)	27.94969 (0.34197)	28.01895 (0.04197)	27.61912 (0.63348)
ZDT4 (st. dev.)	24.02487 (0.32408)	24.48137 (0.17950)	23.82835 (0.38275)
ZDT6 (st. dev.)	22.92276 (0.05367)	23.07460 (0.05973)	22.89177 (0.07056)
DTLZ1 (st. dev.)	120.84596 (0.07311)	120.85190 (0.07338)	120.84041 (0.08505)
DTLZ2 (st. dev.)	120.20962 (0.00030)	120.20952 (0.00064)	120.20952 (0.00064)
DTLZ3 (st. dev.)	116.12579 (5.08719)	118.57561 (3.81988)	115.44508 (5.18048)
DTLZ4 (st. dev.)	119.74700 (0.00817)	119.74231 (0.02676)	119.74801 (0.00491)
DTLZ5 (st. dev.)	114.48096 (0.58667)	115.48724 (0.57578)	115.89698 (0.82226)
DTLZ6 (st. dev.)	114.48096 (0.58667)	115.48724 (0.57578)	115.89698 (0.82226)
DTLZ7 (st. dev.)	49.86711 (0.05572)	49.68336 (0.04674)	49.84884 (0.03746)

Table 4 Parameters for the IG-NSGA-II/GSA algorithm

Parameter	Description	Value
ηc	Distribution index for crossover	20
ηm	Distribution index for mutation	20
γ1	Crossover probability	0.9
γ2	Mutation probability	1/n
N	Number of individuals	100
r	Number of neighbors for GSA	5
γ3	Frequency of the local search	3

Table 5 Averaged indicator results for constrained problems

Function	Hypervolume		Δ2		Max Eval. (Nadir point)
Function	IG-NSGA-II	IG-NSGA-II/GSA	IG-NSGA-II	IG-NSGA-II/GSA	Max Eval. (Nadir point)
Belegundu (std. dev.)	212.8721 (0.3205)	213.1354 (0.2261)	1.6844 (0.1587)	1.5900 (0.1030)	3,000 (12,12)
Binh(2) (std. dev.)	10,294.0037 (17.2207)	10,300.6309 (9.8055)	0.7047 (0.2812)	0.6172 (0.1667)	3,000 (250,50)
Binh(4) (std. dev.)	705.4772 (12.2397)	728.7873 (8.0167)	0.8863 (0.1915)	0.5061 (0.1239)	30,000 (5,7,5)
Obayashi (std. dev.)	22.0321 (0.7574)	21.9269 (0.7103)	0.8084 (0.2869)	0.7792 (0.2772)	20,000 (5,5)
Osyczka (std. dev.)	59.8672 (1.4790)	59.5651 (1.7873)	3.2946 (1.9568)	2.3881 (1.4040)	20,000 (30,30)
Osyczka(2) (std. dev.)	12,780.7978 (42.6364)	13,701.1984 (11.0854)	47.8491 (4.3380)	30.8044 (1.0707)	30,000 (0,85)
Srinivas (std. dev.)	212.8191 (0.3723)	213.1927 (0.0714)	1.4871 (0.2364)	1.3925 (0.1769)	3000 (250,50)
Tamaki (std. dev.)	124.3239 (0.0363)	124.3054 (0.0252)	0.1353 (0.0252)	0.1515 (0.0366)	10,000 (5,5,5)
Tanaka (std. dev.)	22.9022 (0.7516)	24.9672 (0.0249)	0.0672 (0.0472)	0.0468 (0.0100)	10,000 (5,5)
Viennet(4) (std. dev.)	190.2843 (0.6212)	191.6098 (0.4552)	0.1015 (0.0048)	0.0978 (0.0060)	10,000 (8,-10,30)

Table 6 W(x) indicator results on the ZDT and DTLZ test functions

	MOEA/D	MOEA/D/GSA	MOEA/D/NM
ZDT1 (st. dev.)	20.0472 (1.3782)	19.5837 (0.0074)	22.0835 (4.1524)
ZDT2 (st. dev.)	20.0472 (1.3782)	19.5837 (0.0074)	22.0835 (4.1524)
ZDT3 (st. dev.)	31.4010 (0.0518)	31.3602 (0.0558)	31.0517 (1.3529)
ZDT4 (st. dev.)	13.8381 (1.0074)	13.1125 (0.0282)	13.7468 (0.7020)
ZDT6 (st. dev.)	28.8086 (0.0168)	22.9416 (0.0608)	28.8137 (0.0143)
DTLZ1 (st. dev.)	7.9511 (0.0105)	7.9359 (0.0259)	8.28029 (0.0200)
DTLZ2 (st. dev.)	23.7744 (0.0040)	23.7826 (0.0055)	21.5239 (0.0002)
DTLZ3 (st. dev.)	24.6343 (0.5718)	24.2808 (0.2980)	24.6249 (6.4775)
DTLZ4 (st. dev.)	23.7954 (0.0054)	23.8013 (0.0049)	21.5240 (0.0006)
DTLZ5 (st. dev.)	47.3060 (0.0003)	47.3059 (0.0002)	47.3190 (0.0068)
DTLZ6 (st. dev.)	51.6431 (0.9658)	50.9752 (1.0471)	62.51037 (2.6750)
DTLZ7 (st. dev.)	268.9945 (0.0001)	268.9987 (0.0001)	325.7539 (0.1023)

Table 7 Averaged indicator results for CF problems

Function	Hypervolume		Δ2
Function	IG-NSGA-II	IG-NSGA-II/GSA	IG-NSGA-II	IG-NSGA-II/GSA
CF1 (St. Dev.)	22.6991 (0.0108)	22.7063 (0.0160)	0.3488 (0.1134)	0.2809 (0.0457)
CF2 (St. Dev.)	23.9966 (0.2304)	24.1846 (0.1409)	3.7094 (0.8217)	2.6267 (1.1457)
CF3 (St. Dev.)	21.7779 (0.7876)	21.9743 (0.8831)	8.8626 (1.3550)	8.6404 (1.7076)
CF4 (St. Dev.)	22.5691 (0.6465)	22.9710 (0.5158)	4.0301 (0.8088)	3.4467 (0.7596)
CF5 (St. Dev.)	20.8112 (1.0878)	20.6415 (1.1346)	10.0656 (2.1827)	11.6749 (3.3711)
CF6 (St. Dev.)	23.6467 (0.4994)	24.0427 (0.2493)	3.6043 (1.8984)	1.7815 (0.5572)
CF7 (St. Dev.)	20.9140 (0.6613)	21.3019 (0.9006)	16.1164 (5.2138)	13.6514 (6.0834)
CF8 (St. Dev.)	153.2641 (3.3233)	154.5096 (3.2633)	55.8651 (7.8279)	47.8166 (10.3452)
CF9 (St. Dev.)	123.3425 (0.7494)	123.6872 (0.3098)	15.1201 (2.8301)	14.6615 (1.9893)

Table 8 Unconstrained Problems Definition

Function Name	Definition
ZDT1	f1(x)=x1f2(x)=g(x)(1−f1(x)g(x))g(x)=1+9n−1∑i=2nxi	0≤xi≤1, i=1,…,nn=30k=2
ZDT2	f1(x)=x1f2(x)=g(x)(1−(f1(x)g(x))2)g(x)=1+9n−1∑i=2nxi	0≤xi≤1, i=1,…,nn=30k=2
ZDT3	f1(x)=x1f2(x)=1−f1(x)g(x)−(f1(x)g(x))sin (10πf1(x))g(x)=1+9n−1∑i=2nxi	0≤xi≤1, i=1,…,nn=30k=2
ZDT4	f1(x)=x1f2(x)=1−g(x)(f1(x)g(x))2g(x)=1+10(n−1)+∑i=2n(xi2−10 cos (4πxi)	−5≤xi≤5, i=1,…,nn=10k=2
ZDT6	f1(x)=1−(e−4x1) sin (6πx1)f2(x)=1−g(x)(f1(x)g(x))2g(x)=1+9(∑i=2nxin−1)0.25	0≤xi≤1, i=1,…,nn=10k=2
DTLZ1	f1(x)=12x1(1+g(x))f2(x)=12(1−x1)(1+g(x))g(x)=100(5+∑i=2n((xi−0.5)2−cos (20π(xi−0.5))))	0≤xi≤1, i=1,…,nn=7k=2
DTLZ2	f1(x)=(1+g(x)) cos (x1π2)f2(x)=(1+g(x)) sin (x1π2)g(x)=∑i=2n(xi−0.5)	0≤xi≤1, i=1,…,nn=11k=2
DTLZ3	f1(x)=(1+g(x)) cos (x1π2)f2(x)=(1+g(x)) sin (x1π2)g(x)=100(10+∑i=2n((xi−0.5)2−cos (20π(xi−0.5))))	0≤xi≤1, i=1,…,nn=11k=2
DTLZ4	f1(x)=(1+g(x)) cos (x1100π2)f2(x)=(1+g(x)) sin (x1100π2)g(x)=∑i=2n(xi−0.5)	0≤xi≤1, i=1,…,nn=11k=2
DTLZ5	f1(x)=(1+g(x)) cos (x1π2) cos ((π4(1+g(x)))(1+2x2g(x)))f2(x)=(1+g(x)) cos (x1π2) sin ((π4(1+g(x)))(1+2x2g(x)))f3(x)=(1+g(x)) sin (x1π2)g(x)=∑i=2n(xi−0.5)2	0≤xi≤1, i=1,…,nn=10k=3
DTLZ6	f1(x)=(1+g(x)) cos (x1π2) cos ((π4(1+g(x)))(1+2x2g(x)))f2(x)=(1+g(x)) cos (x1π2) sin ((π4(1+g(x)))(1+2x2g(x)))f3(x)=(1+g(x)) sin (x1π2)g(x)=∑i=2n(xi−0.5)2	0≤xi≤1, i=1,…,nn=10k=2
DTLZ7	f1(x)=x1f2(x)=x2f3(x)=(1+g(x))h(x)g(x)=1+98∑i=3nxi h(x)=k−∑i=1k−1(xi sin (1+3πxi))1+g(x)	0≤xi≤1, i=1,…,nn=10k=3
CONV	f1(x)=(x1−1)4+(x2−1)2f2(x)=(x1−1)2+(x2−1)2	−5≤xi≤5, −5≤x2≤5n=2, k=2
KURSAWE	f1(x)=∑i=12(−10e−0.2xi2+xi+12)f2(x)=∑i=13(\|xi\|0.8+ 5 sin (xi3))	−5≤xi≤5, i=1,…,nn=3, k=2
DTLZ3 (3)	f1(x)=(1+g(x)) cos (x1π2) cos (x2π2)f2(x)=(1+g(x)) cos (x1π2) sin (x2π2)f3(x)=(1+g(x)) sin (x1π2)g(x)=100(10+∑i=2n((xi−0.5)2− cos (20π(xi−0.5))))	0≤xi≤1, i=1,…,nn=12k=3

Figures 1-4 present the results of our experiments for population size N=100 averaged over 30 independent runs. Shown are the number of neighbors (i.e., the magnitudes of Nδ(p0)), averaged over all runs and all elements of each population generated by MOEA/D. As it can be seen, the magnitudes of the neighborhoods quickly go over 5 which has been detected as an effective value for r in order to obtain a descent direction (and which has also been used for the computations we present in the sequel). These observations are in consensus with other experiments we have done.

Fig. 1 Averaged number of neighbors for CONV

Fig. 2 Averaged number of neighbors for ZDT1

Fig. 3 Averaged number of neighbors for Kursawe

Fig. 4 Averaged number of neighbors for the three-objective DTLZ3

Based on this insight we choose our neighbor-hood N(p0) for a given candidate solution simply via selecting the r nearest elements (using the 2-norm) out of the population P. See Algorithm 3 for a pseudocode.

Algorithm 3 Selection of neighborhood N(p0) with r nearest elements from population P

Hereby we assume that the population is given by μ elements, i.e.:

P={p1,…,pμ}. (18)

We stress that one potential problem when using MOEA/D is that populations can have duplicate individuals (as a point can be the best found solution for several scalarization problems at the same time). Hence, we have to avoid such solutions as else the distance between the candidate solution and the sample is zero, and GSA is not applicable. More sophisticated neighborhood selection strategies that e.g. take into account the condition number of V or the one MOEA/D is using are subject of ongoing research.

3.1 Approximating the Jacobian

Assume we are given a MOP of the form (1), and that we are given a candidate solution x0 together with the r search directions v1,…,vr∈ℝn that form the subspace.

S:=span{v1,…,vr}. (19)

For every objective fi, the best approximation g˜i(x0) of the gradient ∇fi(x0) at a given point x0 within S can according to the above discussion be computed via:

g˜i:=V(VTV)−1VT∇fi(x0)∈ℝn. (20)

The Jacobian matrix J˜(x0) of F at x0 is defined by:

J(x0)=(∇f1(x0)T⋮∇fk(x0)T)∈ℝk×n. (21)

Thus, using (20)), a best approximation J˜(x0) of J(x0)

—in the sense that each row vector of the matrix is the best approximation of the transposed gradient of fi at x0
—can be obtained via:

J˜(x0):=(g˜1T⋮g˜kT)=J(x0)TV(VTV)−1VT. (22)

For the gradient free realization, consider the product of the first two matrices on the right hand side of (22) it is:

ℱ:=J(x0)TV=(〈∇f1(x0),v1〉⋯〈∇f1(x0),vr〉⋮⋮〈∇fk(x0),v1〉⋯〈∇fk(x0),vr〉). (23)

That is, every entry:

mij=〈∇f1(x0),vj〉, (24)

of the matrix ℱ is a directional derivative and can be approximated via the finite difference method described above, and this approximation comes for free in terms of additional function evaluations as the values fi(xj) are already known. Thus, using (22), where ℱ is computed as described, yields a Jacobian approximation without explicitly computing the objectives’ gradients. The algorithm can hence be seen as gradient free.

3.1.1 Laras Descent Direction for Inequality Constrained MOPs

Once the values of J˜(x0) are computed, we can compute the approximation of the descent direction (6) via:

v˜L=(g˜1‖g˜1‖2+g˜2‖g˜2‖2), (25)

where g˜i, i=1,2, represent the row vectors of the matrix J˜(x0). One interesting aspect would be to know under which conditions (25) actually is a descent direction for both objectives which we have to leave for future work.

One main restriction of descent direction (6) is that it is only applicable for unconstrained MOPs. Here, we make a first attempt to extend the descent direction also for inequality contrained problems where we can make use of the gradient projection method of the GSA. In particular, we propose to first compute the direction v˜ and then to project the vector back to the feasible region if needed (refer to Figure 5 for a graphical illustration of the correction where v˜L points outside the feasible region).

Fig. 5 Correction of the obtained descent direction vL to the feasible region Q

Consider that at the candidate solution x0 we have p active inequality constraints and that v˜L points into an infeasible region (i.e., formally we have ∇gi(x0)Tv˜L<0 for at least one of the active constraints gi). To guarantee that for the projection v˜L it holds ∇gi(x0)Tv˜L≥0 for all active inequality constraints, define M˜∈ℝp×r as follows:

M˜=G(x0)TV=〈∇(gi(x)),vj〉,i=1,…,p,j=1,…,r, (26)

where G(x0) is defined as the Jacobian of the active constraints:

G(x0)=(∇g1(x0)T⋮∇gp(x0)T)∈ℝp×n. (27)

Then, compute the matrix K∈ℝr×(r−1) those column matrix build an orthonormal basis of M˜ (which can be done e.g. via a QR-factorization of M˜). The desired projection is thus given by:

v˜P:=VK(VK)TvL. (28)

The gradient free realization is similar as above: note that every entry of M˜ is again a directional derivative (now for the constrained functions). Thus, we can compute the entries m^ij of M˜ via:

m^ij=〈∇gi(x0),vj〉=gi(xj)−gi(x0)‖xj−x0‖2+O(‖xj−x0‖),i=1,…,p, j=1,…,r. (29)

The use of the projected direction has led to satisfying results in our computations. However, also here it is interesting to know under which condition v˜P actually is a descent direction for all objectives which we have to leave as well for future work.

3.1.2 Computing the Step Size

The GSA is a tool to obtain search directions that are used within line search methods. That is, given a current iterate xi and a search direction vi, the new iterate is computed via:

xi+1=xi+tivi, (30)

where ti>0 0 is the chosen step size. The proper choice of ti is a delicate problem: too small step sizes lead on the one hand to a high probability of getting a better (i.e., dominating) solution, but on the other hand the gain may be too small. In particular, this gain is likely smaller than obtained via the evolutionary strategy (and hence, the local search has no effect on the quality but comes with an additional cost and is thus decreasing the overall performance of the algorithm). In turn, if iterates xi are already near to optimal solutions, large step sizes lead to waste of resources as xi+1 will be dominated by xi.

In our computations we choose an initial step size t0 together with Armijo-like backtracking methods (e.g., [¹⁰, ²⁸]). For the selection of the initial step size t0 we have observed that it makes sense to use the “size” of the neighborhood N(xi) as described above.

In early stages of the search, it is likely that this size is much larger compared to the sizes in later stages. More precisely, given x0, and a neighborhood N(x0) with r elements we compute t0 via:

t0:=1n⋅r∑xi∈N(x0)∑j=1n|xij−x0j|, i=1,…,r. (31)

3.1.3 Balancing the Local Search Resources

One of the challenges that the memetic strategies have is to decide how and when to apply the local search technique. In particular, for some MOPs this task can become very difficult to solve. One such problem, next to the cost caused by the local sarch, is that if the local searcher is applied too often it is possible that the solution set loses diversity [¹⁷]. In MOEA/D one can relatively easy measure the percentage of improvement achived by the local search strategy using the scalar functions of each subproblem.

This section describes a mechanism that takes into consideration the improvement obtained by the GSA and how to measure it. Next, it compares the techniques and redistributes the resources between the base evolutionary algorithm and the GSA-based local search technique in each generation. The mechanism discussed in this section is based on the the scheme proposed in [¹⁷] for the treatment of SOPs.

At this point we consider the evolutionary operators and the GSA as two standalone techniques. On each generation the balancing mechanism measures the quality of each technique. To do so, the improvement must be computed for each individual in the population. Lets consider a given individual xij in the population at the j-th generation.

After its offspring xij+1 is calculated, it is possible to compute the quality of the technique that generates the individual. Lets assume that the points xij, xij+1 and its respective images fij=F(xij) and fij+1=F(xij+1) are given. Using the Tchebycheff decomposition described in Equation (5), the quality term can be described as:

q(fij,fij+1,w,z)=T(fij,w,z)−T(fij+1,w,z), (32)

where w represent the weight of the i-th subproblem where xij is the best solution found so far and z represents the ideal point.

Once the qualities of all subproblems in MOEA/D are computed, the method uses such values to compute an average quality for each technique. To compare the quality at generation j, we assume two different populations created according to the creating method. PT1j represents the subpopulation generated by evolutionary techniques and PT2j represents the subpopulation generated with the GSA method. For each one of the techniques the average quality Q(PTij), i=1,2, is given by:

Q(PTij)=1|PTij|∑xij∈PTijqi(xij). (33)

Now, the balance mechanism uses the quality of each technique to compute the number of function calls that each technique is going to use on the next generation. The participation ratios are computed as shown in Algorithm 4. Participation ratios are used to distribute the number of function calls to each technique. In order to ensure that both techniques are applied at least on a percentage of the population, the parameters rmin andrmax are introduced. Such parameters establish a lower and an upper bound for the techniques.

Algorithm 4 Computation of participation ratios

3.1.4 Memetic Algorithms

Based on the above discussions, we will propose in the sequel two different memetic algorithms whose local searchers utilize GSA. The first one, MOEA/D/GSA, is based on MOEA/D and is designed for the treatment of unconstrained MOPs. Algorithm 5 presents the pseudo code of this algorithm. Here, it is important to mention that the initial participation ratio of the GSA is set on a lower value compared with the one of the evolutionary algorithm.

Algorithm 5 Pseudocode of the MOEA/D/GSA

The reason for this is that in principle, when the algorithm starts the neighborhood size of any candidate solution of the population may be very large. Because of this size it is not quite recommendable to apply GSA since it is quite possible that the offspring do not present an improvement (note that GSA is based on forward difference methods which only work properly if the samples x _i are near to the candidate solution x0).

The next memetic strategy we propose here is IG-NSGA-II/GSA whose base algorithm is IG-NSGA-II (Algorithm 2) and whose offspring are created as shown in Algorithm 6. The main difference resides in the creation of the offspring individuals. As we will use the projected search direction v˜P from (28), IG-NSGA-II/GSA is capable of handling inequality constrained MOPs.

Algorithm 6 Creation of an offspring using IG-NSGA-II/GSA

4 Experimental Results

In this section, we present different experiments dedicated to show the strength of GSA as an effective tool within multi-objective memetic algorithms. For this, we will first consider shortly the GSA as standalone algorithm.

Further on, we will consider and compare the two newly proposed memetic algorithms MOEA/D/GSA and IG-NSGA-II/GSA on several bechmark functions, where we will consider (a) unconstained MOPs, (b) MOPs with relatively easy constraints, and (c) MOPs with complex constraints.

4.1 GSA within Laras Method

First we investigate one line search method, namely the iteration:

xi+1=xi+tivL,i, (34)

where vL,i is the GSA-approximation of the descent direction (25) at the current iterate xi and where ti is the step size control as described above. As example we consider the unconstrained convex bi-objective problem:

fj(x)=∑i=1n‖xi−aji‖22, j=1,2, (35)

where a1=(1,…,1)T∈ℝn and a2=(−1,…,−1)T∈ℝn. For the value of n we have chosen 10.

First we investigate the influence of the number r of sampling points. Figure 6 presents the result of the following experiment: we have taken x0∈ℝ10 at random and have generated a set of r neighorhood samples from Nδ(x0) for δ=0.15 for r=3, 5, and 7.

Fig. 6 Results of the approximation using several values of r

The figure shows the points y0:=F(x0), the iterate y1:=F(x0+vL) (i.e., the image of the iterate x1:=x0+vL for the step size t=1) and further the approximations of y1 when using GSA and r samples. As it can be seen, already for r=5 a significant move toward the Pareto front can be obtained, and for r=7 the performance comes quite close to the usage of the exact gradient.

Since the consideration of one step has no significance, we consider an entire trajectory of solutions starting with x0 using r=5. In Figure 7 a sequence of 15 iterations is shown. As it can be seen, the iterations come close to the Pareto front.

Fig. 7 Standalone GSA applied on quadratic function

As already mentioned above, more theoretical investigations are required to fully understand GSA based local search which we, however have to leave to future research. In the following we focus on the effect of the GSA based local search engines within memetic multi-objective evolutionary algorithms.

4.2 MOEA/D/GSA

For this experiment we use the original version of the MOEA/D algorithm as it was presented in [⁴²]. A comparison using two different state-of-the-art indicators is performed. The Δ2 indicator [³⁵] and the hypervolume indicator [⁴⁴] are selected for the performance assessment. Both indicators measure spread and convergence along the Pareto front to a certain extent. We propose to compare the algorithms (the standalone and the memetic version), after they have spent a certain number of function calls. We perform our comparison when the algorithms reached 30,000 function evaluations for the functions with two objectives and 50,000 for k=3. We stress that MOEA/D is not explicitly aiming at one of any existing performance indicators. Instead, it measures the success via improvements in the scalarization functions.

Hence, along with the two state-of-the-art indicators, we propose a specialized indicator that measuares the averaged scalar value of the whole population (and thus, in a sense the overall success of the MOEA/D variants). In particular, we propose to use the indicator W that is defined as follows:

W(P)=∑iNT(F(xi),wi,z), (36)

where xi∈P is the best found individual for the i-th subproblem. We introduce this new indicator, as we have observed that if W decreases (and thus, in average also the best found solutions for all sub-problems of MOEA/D), this does not necessarily yield better hypervolume nor Δp values. In other words, the sub-problems are not matched to these (or any other), performance indicators.

To confirm that the GSA method can really improve a memetic algorithm it becomes necessary to make a comparison on different test functions. In particular, for this memetic strategy we selected two of the state-of-the-art benchmark suites. The first selected set of functions is the modified version of the ZDT functions proposed in [³⁹]. For the second part of the experiments we use the DTLZ benchmark for k=3 [⁸].

Table 1 presents the parameters used for each one of these benchmarks.

We performed an experiment in order to compare the performance of the memetic algorithm based on the GSA as a local searcher. For each comparison we observed the results over 30 independent runs. The experiments were performed over the ZDT test functions and the DTLZ test functions. The effectiveness of the MOEA/D/GSA was measured comparing the memetic algorithm along with two other methods: the base variant of MOEA/D and a memetic strategy based on the Nelder-Mead method. In these experiments we use a modified version of the Nelder-Mead method as follows: we modified the algorithm to incorporate neighboring information on it (as the GSA does). In particular, we incorporate r individuals from the population into the construction of the simplex. In case that r<(n+1), we computed the remaining n−r+1 using the mechanisms proposed in the original Nelder-Mead algorithm. The resulting algorithm is termed here MOEA/D/NM.

Tables 2 to 6 present the results obtained by the three algorithms on the three different performance indicators. The nadir points for the hypervolume indicator are set as follows: (5,5)T for the ZDT test functions, (11,11)T for DTLZ 1-4 and (5,5,5)T for DTLZ 5-7. The tables present the results at a certain stage of the algorithm. That is, we measured the indicators after a certain number of function calls. For the ZDT test function we measure the results after 15,000 function evaluations. Meanwhile, for the DTLZ functions we obtained the results after 30,000 function evaluations. Bold numbers indicate that the algorithm is outperforming the other ones significantly. As it can be seen, MOEA/D/GSA wins in most cases. For instance, for the W indicator, MOEA/D/GSA significantly wins in 8 out of 12 cases, and is only outperformed on ZDT3 by MOEA/D/NM.

Figure 8 presents the computed Pareto fronts for the best individual on each of the ZDT problems.

Fig. 8 Pareto fronts of ZDT problems

Here, it is possible to observe that in most of the cases the GSA helps the algorithm to achieve a ”better” approximation along the entire Pareto front.

4.3 IG-NSGA-II/GSA

The next set of experiments is prepared in order to demonstrate that the GSA can be used to handle constrained problems. Since we are replacing the local search method of the IG-NSGA-II we adopt all the parameters from such algorithm. First, we consider some of the function presented in [⁵]. The definitions of such functions can be found in Table 9. The maximization problems presented on the definitions were transformed into minimization problems. The constraints of the problems were also transformed into the form a(x)≤0. We stress that all constraints are relatively easy in the sense that the feasible set is not of a highly complex structure. Table 4 presents the values for the parameters for the memetic algorithm.

Table 9 Constrained Problems Definition

Function n	Definition	Constraints
Belegundu n=2	f1(x)=−2x1+x2f2(x)=2x1+x2	−x1+x2−1≤0x1+x2−7≤00≤x1≤50≤x2≤3
Binh(2) n=2	f1(x)=4x12+4x22f2(x)=(x1−5)2+(x2−5)2	(x1−5)2+x22−25≤0−(x1−8)2−(x2+3)3+7.7≤00≤x1≤50≤x2≤3
Binh(4) n=2	f1(x)=1.5−x1(1−x2)f2(x)=2.25−x1(1−x22)f3(x)=2.625−x1(1−x23)	−x12−(x2−0.5)2+9≤0(x1−1)2+(x2−0.5)2−6.25≤0−10≤x1≤10−10≤x2≤10
Obayashi n=2	Maximizef1(x)=x1f2(x)=x2f2(x)=x2	x12+x22≤10≤x1≤10≤x2≤1
Ozyczka n=2	f1(x)=x1+x22f2(x)=x12+x2	12−x1−x2≥0x12+10x1−x22+16x2−80≥02≤x1≤75≤x2≤10
Ozyczka 2 n=6	f1(x)=−(25(x1−2)2+(x2−2)2+(x3−1)2+(x4−4)2+(x5−1)2)f2(x)=x12+x22+x32+x42+x52+x62	x1+x2−2≥06−x1−x2≥02−x2+x1≥02−x1+3x2≥04−(x3−3)2−x4≥0(x5−3)2+x6−4≥00≤x1,x2,x6≤101≤x3,x5≤50≤x4≤6
Srinivas n=2	f1(x)=(x1−2)2+(x2−1)2+2f2(x)=9x1−(x2−1)2	x12+x22−225≤0x1−3x2+10≤0−20≤x1,x2≤20
Tamaki n=3	Maximizef1(x)=x1f2(x)=x2f3(x)=x3	x12+x22+x3≤1x1,x2,x3≥0
Tanaka n=2	f1(x)=x1f2(x)=x2	−x12−x22+1+0.1 cos (16 arctan⁡(x1x2))≤0(x1−0.5)2+(x2−0.5)2≤0.50<x1,x2≤π
Viennet (4) n=2	f1(x)=(x1−2)22+(x2+1)213+3f2(x)=(x1+x2−3)2175+(2x2−x1)217−13f3(x)=(3x1−2x2+4)28+(x1−x2+1)227+15	−4x1−x2+4≥0x1+1≥0x2−x1+2≥0−4≤x1,x2≤4

The results obtained in Table 5 show that the GSA can improve the results in most of the functions according to the Δ2 indicator. Table 5 also presents the function evaluation used as stopping criteria. Moreover, the table also presents the nadir point used to compute the hypervolume indicator. It is important to mention that some indicator values obtained by the GSA outperformed significantly the standalone algorithm. Taking these results into consideration, now we are in position to confirm the efficiency of the GSA when it is used within a memetic strategy.

As a last experiment we use the constrained CF functions proposed in [⁴³] those constraints can considered to be complex. We perform a similar comparison as in the previous method. We measure the Δ2 and the hypervolume indicators at certain stage of the evolution (i.e. at 30,000 and 50,000 function evaluations). For the experiments we set the nadir point for hypervolume as (5,5)T∈ℝ2 and (5,5,5)T∈ℝ3. Table 7 presents the averaged results measured by the proposed indicators. The results are obtained taking in consideration only the feasible solutions obtained by the algorithms. By such reason the function CF10 is not in the statistical results since neither of the algorithms obtained feasible solutions.

Figure 9 presents some of the Pareto fronts obtained by the algorithms on the CF functions.

Fig. 9 Pareto fronts of CF problems

5 Conclusions and Future Work

In this paper, we have argued that the gradient subspace approximation (GSA) is a powerful tool for local search within memetic algorithms for the numerical treatment of multi-objective optimization problems (MOPs). The GSA utilizes existing neighborhood information I from a given candidate solution and is able to approximate the best approximation of the gradient within the subspace of the decision variable space that is defined by I. Thus, the computation of the search direction via GSA comes ideally for free for population based search algorithms such as evolutionary algorithms. The strengths of GSA within memetic algorithms for the treatment of scalar optimization problems have recently been reported in [³³]. Here, we have discussed the GSA for the case that multiple objectives have to be considered concurrently. To this end, we have

— empirically shown that the existing neighbor-hood information within populations of multi-objective evolutionary algorithms is sufficient for the application of GSA,
— discussed how to approximate the Jacobian matrix at a given candidate solution x0 via GSA,
— discussed how to adapt Laras search direction in case inequality constraints are at hand and how to choose the step size control,
— proposed two particular GSA-based memetic algorithms, and have presented numerical results on some selected benchmark problems. In both cases the application of the local search-that comes basically for free in terms of additional function calls-significantly improved the performance of the base algorithm. More precisely, the increase of the performance was significant for unconstrained problems and for the MOPs with relatively easy constraints. For complex constraints, no such significant performance improvements could be obtained so far.

In conclusion, this first study is very promising and will encourage us for future research in this direction. This will include the development of more sophisticated constraint handling techniques which are mandatory to extend the applicability of the algorithms to more complex problems. Further, a fine tuning and a more advanced interplay of local and global search elements will be helpful to increase the overall performance.

Finally, one interesting next step would be to use GSA-based local search within indicator based evolutionary algorithms as such methods naturally define a (population based) scalar optimization problem that is induced by the indicator.

Acknowledgements

The authors acknowledge funding from Conacyt project no. 285599 ”Toma de decisiones multiobjetivo para sistemas altamente complejos”.

References

1 . Beume, N., Naujoks, B., & Emmerich, M. (2007). SMS-EMOA: Multiobjective selection based on dominated hypervolume. European Journal of Operational Research, Vol. 181, No. 3, pp. 1653 - 1669. [ Links ]

2 . Bhuvana, J., & Aravindan, C. (2016). Memetic algorithm with preferential local search using adaptive weights for multi-objective optimization problems. Soft Computing, Vol. 20, No. 4, pp. 1365-1388. [ Links ]

3 . Bosman, P. A. N (2012). On gradients and hybrid evolutionary algorithms for real-valued multiobjective optimization. IEEE Transactions on Evolutionary Computation, Vol. 16, No. 1, pp. 51-69. [ Links ]

4 . Brown, M., & Smith, R. E. (2005). Directed multi-objective optimization. International Journal of Computers, Systems, and Signals, Vol. 6, No. 1, pp. 3-17. [ Links ]

5 . Coello, C. A. C., Lamont, G. B., & Veldhuizen, D. A. V. (2007). Evolutionary Algorithms for Solving Multi-objective Problems, volume 5. Springer. [ Links ]

6 . Das, I., & Dennis, J. E. (1998). Normal-Boundary Intersection: A New Method for Generating the Pareto Surface in Nonlinear Multicriteria Optimization Problems. SIAM Journal on Optimization, Vol. 8, No. 3, pp. 631-657. [ Links ]

7 . Deb, K (2001). Multi-Objective Optimization Using Evolutionary Algorithms. Wiley Interscience Series in S. Wiley. [ Links ]

8 . Deb, K., Thiele, L., Laumanns, M., & Zitzler, E. (2002). Scalable multi-objective optimization test problems. Evolutionary Computation, 2002. CEC’02. Proceedings of the 2002 Congress on, volume 1, IEEE, pp. 825-830. [ Links ]

9 . Dellnitz, M., Schütze, O., & Hestermeyer, T. (2005). Covering Pareto sets by multilevel subdivision techniques. Journal of Optimization Theory and Applications, Vol. 124, No. 1, pp. 113-155. [ Links ]

10 . Drummond, L. M. G., & Svaiter, B. F. (2005). A steepest descent method for vector optimization. Journal of Computational and Applied Mathematics, Vol. 175, pp. 395-414. [ Links ]

11 . Fernández, J., Schütze, O., Hernández, C., Sun, J.-Q., & Xiong, F.-R. (2016). Parallel simple cell mapping for multi-objective optimization. Engineering Optimization, Vol. 48, No. 11, pp. 1845-1868. [ Links ]

12 . Hernández, C., Naranjani, Y., Sardahi, Y., Liang, W., Schütze, O., & Sun, J.-Q. (2013). Simple cell mapping method for multi-objective optimal feedback control design. International Journal of Dynamics and Control, Vol. 1, No. 3, pp. 231-238. [ Links ]

13 . Hernández, V. A. S., Schütze, O., Rudolph, G., & Trautmann, H. (2016). The hypervolume based directed search method for multi-objective optimization problems. Journal of Heuristics, Vol. 22, No. 3, pp. 273-300. [ Links ]

14 . Hillermeier, C (2001). Nonlinear Multiobjective Pptimization: a Generalized Homotopy Approach, volume 135. Springer Science & Business Media. [ Links ]

15 . Ishibuchi, H., Yoshida, T., & Murata, T. (2003). Balance between genetic search and local search in memetic algorithms for multiobjective permutation flowshop scheduling. IEEE Transactions on Evolutionary Computation, Vol. 7, No. 2, pp. 204-223. [ Links ]

16 . Jahn, J (2006). Multiobjective search algorithm with subdivision technique. Computational Optimization and Applications, Vol. 35, No. 2, pp. 161-175. [ Links ]

17 . LaTorre, A., Muelas, S., & Peña, J.-M. (2011). A MOS-based dynamic memetic differential evolution algorithm for continuous optimization: A scalability test. Soft Computing, Vol. 15, No. 11, pp. 2187- 2199. [ Links ]

18 . Lewis, R. M., Torczon, V., & Trosset, M. W. (2000). Direct search methods: Then and now. Journal of Computational and Applied Mathematics, Vol. 124, No. 1, pp. 191-207. [ Links ]

19 . Liu, T., Gao, X., & Yuan, Q. (2016). An improved gradient-based NSGA-II algorithm by a new chaotic map model. Soft Computing. [ Links ]

20 . López, A. L., Coello, C. A. C., & Schütze, O. (2010). A painless gradient-assisted multi-objective memetic mechanism for solving continuous bi-objective optimization problems. IEEE Congress on Evolutionary Computation, IEEE, pp. 1-8. [ Links ]

21 . Lopez, E. M., & Coello, C. A. C. (2016). A Parallel Multi-objective Memetic Algorithm Based on the IGD+ Indicator. International Conference on Parallel Problem Solving from Nature, Springer, pp. 473-482. [ Links ]

22 . Martín, A., & Schütze, O. (2017). Pareto tracer: a predictor-corrector method for multi-objective optimization problems. Engineering Optimization (to appear). [ Links ]

23 . Martin, B., Goldsztejn, A., Granvilliers, L., & Jermann, C. (2014). On continuation methods for non-linear bi-objective optimization: towards a certified interval-based approach. Journal of Global Optimization, Vol. 64, No. 1, pp. 1-14. [ Links ]

24 . Martin, B., Goldsztejn, A., Granvilliers, L., & Jermann, C. (2016). On continuation methods for non-linear bi-objective optimization: towards a certified interval-based approach. Journal of Global Optimization, Vol. 64, No. 1, pp. 3-16. [ Links ]

25 . Martínez, S. Z., & Coello, C. A. C. (2012). A direct local search mechanism for decomposition-based multi-objective evolutionary algorithms. 2012 IEEE Congress on Evolutionary Computation, pp. 1-8. [ Links ]

26 . Miettinen, K (2012). Nonlinear Multiobjective Optimization, volume 12. Springer Science & Business Media. [ Links ]

27 . Naranjani, Y., Hernández, C., Xiong, F.-R., Schütze, O., & Sun, J.-Q. (2016). A hybrid method of evolutionary algorithm and simple cell mapping for multi-objective optimization problems. International Journal of Dynamics and Control, pp. 1-13. [ Links ]

28 . Nocedal, J., & Wright, S. (2006). Numerical Optimization. Springer Series in Operations Research and Financial Engineering. Springer. [ Links ]

29 . Pareto, V (1896). Cours D’Economie Politique. F. Rouge, Switzerland. [ Links ]

30 . Pereyra, V., Saunders, M., & Castillo, J. (2013). Equispaced Pareto front construction for constrained bi-objective optimization. Math Comput Model, Vol. 57, No. 9-10, pp. 2122-2131. [ Links ]

31 . Rakowska, J., Haftka, R. T., & Watson, L. T. (1993). Multi-objective control-structure optimization via homotopy methods. SIAM Journal on Optimization, Vol. 3, No. 3, pp. 654-667. [ Links ]

32 . Ringkamp, M., Ober-Blöbaum, S., Dellnitz, M., & Schütze, O. (2012). Handling high dimensional problems with multi-objective continuation methods via successive approximation of the tangent space. Engineering Optimization, Vol. 44, No. 6, pp. 1117-1146. [ Links ]

33 . Schütze, O., Alvarado, S., Segura, C., & Landa, R. (2016). Gradient subspace approximation: A direct search method for memetic computing. Soft Computing, pp. 1-20. [ Links ]

34 . Schütze, O., Dell’Aere, A., & Dellnitz, M. (2005). On continuation methods for the numerical treatment of multi-objective optimization problems. Dagstuhl Seminar Proceedings, Schloss Dagstuhl-Leibniz-Zentrum für Informatik. [ Links ]

35 . Schütze, O., Esquivel, X., Lara, A., & Coello, C. A. C. (2012). Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization. IEEE Transactions on Evolutionary Computation, Vol. 16, No. 4, pp. 504- 522. [ Links ]

36 . Schütze, O., Lara, A., & Coello, C. A. (2011). On the influence of the number of objectives on the hardness of a multiobjective optimization problem. IEEE Transactions on Evolutionary Computation, Vol. 15, No. 4, pp. 444-455. [ Links ]

37 . Schütze, O., Lara, A., Sanchez, G., & Coello, C. A. (2010). HCS: A new local search strategy for memetic multiobjective evolutionary algorithms. IEEE Transactions on Evolutionary Computation, Vol. 14, No. 1, pp. 112-132. [ Links ]

38 . Schütze, O., Martín, A., Lara, A., Alvaradoa, S., Salinas, E., & Coello, C. A. C. (2016). The directed search method for multiobjective memetic algorithms. Journal of Computational Optimization and Applications, Vol. 63, pp. 305-332. [ Links ]

39 . Shukla, P. K (2007). On Gradient Based Local Search Methods in Unconstrained Evolutionary Multi-objective Optimization. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 96-110. [ Links ]

40 . Sindhya, K., Miettinen, K., & Deb, K. (2013). A hybrid framework for evolutionary multi-objective optimization. IEEE Transactions on Evolutionary Computation, Vol. 17, No. 4, pp. 495-511. [ Links ]

41 . Yu, G., Chai, T., & Luo, X. (2011). Multiobjective production planning optimization using hybrid evolutionary algorithms for mineral processing. IEEE Transactions on Evolutionary Computation, Vol. 15, No. 4, pp. 487-514. [ Links ]

42 . Zhang, Q., & Li, H. (2007). MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Transactions on Evolutionary Computation, Vol. 11, No. 6, pp. 712-731. [ Links ]

43 . Zhang, Q., Zhou, A., Zhao, S., Suganthan, P. N., Liu, W., & Tiwari, S. (2008). Multiobjective optimization test instances for the CEC 2009 special session and competition. University of Essex, Colchester, UK and Nanyang technological University, Singapore, special session on performance assessment of multi-objective optimization algorithms, technical report, Vol. 264. [ Links ]

44 . Zitzler, E., & Thiele, L. (1999). Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Transactions on Evolutionary Computation, Vol. 3, No. 4, pp. 257-271. [ Links ]

Received: February 01, 2017; Accepted: September 06, 2017

Corresponding author is Sergio Alvarado. salvarado@computacion.cs.cinvestav.mx, schuetze@cs.cinvestav.mx, carlos.segura@cimat.mx, saul.zapotecas@gmail.com.

This is an open-access article distributed under the terms of the Creative Commons Attribution License