A Generalization of the Averaged Hausdorff Distance

Vargas, Andrés; Bogoya, Johan

doi:10.13053/cys-22-2-2950

Computación y Sistemas

On-line version ISSN 2007-9737; Print version ISSN 1405-5546

Comp. y Sist. vol.22 n.2 Ciudad de México Apr./Jun. 2018 Epub Jan 21, 2021

Articles of the Thematic Issue

A Generalization of the Averaged Hausdorff Distance

Andrés Vargas¹

Johan Bogoya¹

¹Pontificia Universidad Javeriana, Bogotá, Colombia.

Abstract:

The averaged Hausdorff distance Δp is an inframetric which has recently been used in evolutionary multiobjective optimization (EMO). In this paper we introduce a new two-parameter performance indicator Δp,q which generalizes Δp as well as the standard Hausdorff distance. For p,q≥1 the indicator Δp,q (which we call the (p,q)-averaged distance) turns out to be a proper metric and preserves some of the advantages of Δp. We prove several properties of Δp,q, and provide a comparison with Δp and the standard Hausdorff distance. For simplicity we restrict ourselves to finite sets, which is the most common case, but our results can be extended to the continuous case.

Keywords: Averaged Hausdorff distance; generational distance; inverted generational distance; multiobjective optimization; performance indicator; power means

1 Introduction

In most cases, the solution of a multiobjective optimization problem (MOP) is a subset of ℝⁿ called the Pareto set (P). Sometimes P can be computed analytically but, typically, a numerical algorithm is needed to find a reasonable finite-size approximation.

To establish the accuracy of an EMO-algorithm trying to approximate this Pareto set or its image, the Pareto front (PF), we can measure the distance between the algorithm outcome set A and the respective Pareto set or front. Since, in general, a specific distance to PF can be attained for different sets A, this method will not produce unique solutions.

For any metric space X, the standard Hausdorff distance dH (see [³, ⁶]) is a metric for 𝒫c(X) (the family of all compact subsets of X). Intuitively, if dH(X,Y) is small, then every x∈X is close to some y∈Y and vice versa. The metric dH is used in Brownian motion [¹⁵], matrix theory [¹], dynamical systems [²], and fractal geometry [⁷], among other research areas. In the theory of evolutionary multiobjective optimization (EMO), the closeness of a set A to a certain PF determines the approximation (called convergence in the EMO literature) of the outcome, and the closeness of PF to A determines the spread (maximal gap).

The metric dH is rarely used by the EMO community because its values allow for undesirable ambiguities. An illustrative example is that a large value of dH(PF,A) can indicate either a “bad” approximation or a “good” one with at least one outlier (see Figure 1). Instead of dH, several alternative indicators have been introduced, e.g. the hypervolume indicator [¹⁷], the R indicators [¹⁰], or the averaged Hausdorff distance Δp introduced in [¹³]. The adequacy of the use of different indicators has been studied in [¹⁴]. Among them, the performance indicator Δp has the advantage of not punishing outliers heavily and of producing solutions evenly spread around PF (which is highly desirable [¹⁶]), but the disadvantage of not satisfying the triangle inequality. In other words, Δp is not a metric but a semimetric with a relaxed triangle inequality, which we will refer to as an inframetric. This terminology, explained in more detail at the end of Section 2, does not conflict with related (but different) notions and it has also been used in computer science (see [⁸]).

Fig. 1 A hypothetical Pareto front discretization P′ (black circles) and two different archives: X1 (blue dots) and X2 (orange squares)

In this paper we introduce a new indicator Δp,q, that we call the (p,q)-averaged Hausdorff distance, or simply (p,q)-averaged distance for brevity.

When p,q≥1 this distance turns out to be a proper metric and preserves the principal advantages of Δp. When |p|,|q|≥1 (but not p,q≥1), we show that Δp,q satisfies a relaxed triangle inequality, turning it into an inframetric. Moreover, Δp,q is related to the p-averaged Hausdorff distance Δp and the standard Hausdorff distance dH in the following way:

$$\Delta_{p,-\infty}(\cdot,\cdot) := \lim_{q\to-\infty}\Delta_{p,q}(\cdot,\cdot) = \Delta_p(\cdot,\cdot), \tag{1.1}$$

and

$$\Delta_{\infty,-\infty}(\cdot,\cdot) := \lim_{\substack{p\to\infty \\ q\to-\infty}}\Delta_{p,q}(\cdot,\cdot) = d_H(\cdot,\cdot). \tag{1.2}$$

The remainder of this paper is organized as follows. Section 2 presents some basic preliminaries, including well-known properties of q-power means, the Generational Distance GD, the Inverted Generational Distance IGD, and their p-averaged versions GDp and IGDp, respectively. This section concludes with a review of the inframetric properties of the p-averaged Hausdorff distance Δp. Section 3 presents GDp,q and IGDp,q, which are modifications of GDp and IGDp, respectively. There we introduce the (p,q)-averaged distance Δp,q and prove several properties, including a result related to Pareto-compliance. Section 4 presents some numerical results showing the behavior of Δp,q as an indicator. Finally, in Sections 5 and 6, we present our conclusions and future work proposals, respectively.

2 Preliminaries

2.1 Multiobjective Optimization

For a vector-valued function f:X⊂ℝⁿ→ℝℓ, the multiobjective optimization problem (MOP) under consideration requires the simultaneous minimization of its ℓ component functions f1,…,fℓ. A solution is optimal when the elements of the image Y=f(X) are nondominated in the sense of Pareto [¹¹], which derives from a partial order in ℝℓ, whose definition we recall below for the convenience of the reader.

Let (Y,≼) be a subset Y⊂ℝℓ equipped with the partial order ≼ defined for y,y′∈Y by:

$$y \preceq y' \quad\text{if and only if}\quad y_i \le y'_i \quad\text{for all } i=1,\dots,\ell.$$

An element y∈Y is said to be dominated by y′∈Y, denoted y′≺y, if y′≼y and y≠y′. Moreover, y∈Y is dominated by Y′⊂Y, written Y′≼y, if there exists some y′∈Y′ such that y′≼y; otherwise it is said to be nondominated by it, written Y′⋠y. A subset Y′⊂Y is dominated by a subset Y″⊂Y, written Y″≼Y′, if for every y′∈Y′ there exists some y″∈Y″ such that y″≼y′. If this is not the case, Y′ is said to be nondominated by Y″, denoted Y″⋠Y′.

If Y=f(X) is the objective space of some MOP with decision space X⊂ℝⁿ and objective function f:X→ℝℓ, its Pareto front is defined as the set Y*:={y∈Y | ∄ y′∈Y: y′≺y} of nondominated elements. An element x∈X is called Pareto-optimal if its image is nondominated, i.e., f(x)∈Y*, and the set X* of all Pareto-optimal points is called the Pareto set.

Finally, if Y⊂ℝℓ, we say that a performance indicator given by a function ℐ:𝒫(Y)→ℝ is Pareto-compliant if for subsets A,B⊂Y the strict dominance condition A≼B and B⋠A implies the relation ℐ(A)≤ℐ(B) (or, in a stronger version, ℐ(A)<ℐ(B)). We refer the reader to [¹⁸] for details. In Section 3.2 we provide a brief discussion of the compliance of the indicator associated to Δp,q with Pareto-related optimality criteria.

2.2 Power Means

For a finite set $X=\{x_i\}_{i=1}^{N}\subset[0,\infty)$ and a non-zero real $q$, the q-average or q-power mean of X is given by:

$$\mathcal{M}_q(X) := \left(\frac{1}{N}\sum_{i=1}^{N} x_i^{q}\right)^{1/q}.$$

Remark. When the set of values taken by an indexing quantity has been explicitly specified, e.g. $i\in I:=\{1,\dots,N\}$, the following abbreviated notation will be used for convenience:

$$\mathcal{M}_q^{i\in I}(x_i) := \mathcal{M}_q(\{x_i\}_{i\in I}) = \mathcal{M}_q(X).$$

A comprehensive reference on the theory and properties of means is e.g. [⁵], where proofs of the statements presented in this section can be found.

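It is well known that limit cases of power means recover familiar quantities. For example,

$$\lim_{q\to 0}\mathcal{M}_q(X) = \left(\prod_{i=1}^{N} x_i\right)^{1/N}$$

is the standard geometric mean of X. Additionally,

$$\lim_{q\to\infty}\mathcal{M}_q(X) = \max\{x_1,\dots,x_N\}, \quad\text{and}\quad \lim_{q\to-\infty}\mathcal{M}_q(X) = \min\{x_1,\dots,x_N\}.$$

The special case q=−1 corresponds to the harmonic mean, which is also of interest to us:

$$\operatorname{harm}(X) := \mathcal{M}_{-1}(X).$$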

Now, we can define the q-average of a finite set for any q in the extended real line ℝ¯:=[−∞,∞]. Let Y:={yi}i∈I be a finite subset of [0,∞). The following properties hold for power means:

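1. If $x_i\le y_i$ for all indices $i\in I$, then for any $q\in\overline{\mathbb{R}}$:

$$\mathcal{M}_q(X) \le \mathcal{M}_q(Y). \tag{2.1}$$

2. For $p,q\in\overline{\mathbb{R}}$, if $p\le q$, then:

$$\mathcal{M}_p(X) \le \mathcal{M}_q(X). \tag{2.2}$$

3. If $A=(a_{ij})$ denotes an array of indexed positive elements with $i\in I$ and $j\in J$, then their p-average satisfies:

$$\mathcal{M}_p(A) = \mathcal{M}_p^{i\in I}\big(\mathcal{M}_p^{j\in J}(a_{ij})\big) = \mathcal{M}_p^{j\in J}\big(\mathcal{M}_p^{i\in I}(a_{ij})\big). \tag{2.3}$$

4. For $p\ge 1$ it follows from the Minkowski inequality that:

$$\mathcal{M}_p^{i\in I}(x_i+y_i) \le \mathcal{M}_p^{i\in I}(x_i) + \mathcal{M}_p^{i\in I}(y_i). \tag{2.4}$$

5. The harmonic mean admits the bound:

$$\operatorname{harm}(X) \le N\min(X). \tag{2.5}$$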
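As an informal numerical illustration of the limit cases and of properties (2.2) and (2.5), the following Python sketch implements the q-power mean. The function name `power_mean`, the sample values, and the convention used for zero entries are assumptions made here for illustration only.

```python
import math

def power_mean(values, q):
    """q-power mean of nonnegative numbers, including q = 0, +inf, -inf."""
    xs = list(values)
    n = len(xs)
    if q == math.inf:
        return max(xs)            # limit q -> +infinity
    if q == -math.inf:
        return min(xs)            # limit q -> -infinity
    if q == 0:
        # Geometric mean (limit q -> 0); zero as soon as one entry is zero.
        if any(x == 0 for x in xs):
            return 0.0
        return math.exp(sum(math.log(x) for x in xs) / n)
    if q < 0 and any(x == 0 for x in xs):
        # Assumed convention: a zero entry forces the mean to zero for q < 0.
        return 0.0
    return (sum(x ** q for x in xs) / n) ** (1.0 / q)

xs = [1.0, 2.0, 4.0]
# Means for increasing q; property (2.2) says this list is nondecreasing.
means = [power_mean(xs, q) for q in (-math.inf, -1, 0, 1, 2, math.inf)]
harmonic = power_mean(xs, -1)     # harm(X), bounded by N * min(X) as in (2.5)
```

For the sample set {1, 2, 4} the geometric mean is 2 and the harmonic mean is 12/7, so both the monotonicity in q and the bound harm(X) ≤ N min(X) can be checked directly.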

2.3 Averaged Hausdorff Distance

Suppose that A and B belong to the family of finite subsets of ℝn, denoted by 𝒫0(ℝn), and that p≥1. Recall that the “modified” generational distance (GDp) and the inverted generational distance (IGDp) are defined by power means as follows (see [¹³]):

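$$\mathrm{GD}_p(A,B) := \left(\frac{1}{N_A}\sum_{i=1}^{N_A} d(a_i,B)^p\right)^{1/p} = \left(\frac{1}{N_A}\sum_{i=1}^{N_A}\min_{j=1,\dots,N_B}\{d(a_i,b_j)^p\}\right)^{1/p},$$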

and:

IGDp(A,B):=GDp(B,A).

where d(⋅,⋅) stands for the standard Euclidean metric. Let us denote by Δp:𝒫0(ℝn)×𝒫0(ℝn)→ℝ the so-called averaged Hausdorff distance, i.e.:

Δp(A,B):=max⁡{GDp(A,B),IGDp(A,B)}.
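The three definitions above can be sketched in a few lines of Python. This is only a minimal illustration for finite sets of points given as tuples; the function names `gd_p`, `igd_p`, `delta_p` and the two small sample sets are hypothetical and chosen here for the example.

```python
import math

def gd_p(A, B, p):
    """GD_p(A, B): p-power mean over a in A of the distance from a to the set B."""
    return (sum(min(math.dist(a, b) for b in B) ** p for a in A) / len(A)) ** (1.0 / p)

def igd_p(A, B, p):
    """IGD_p(A, B) := GD_p(B, A)."""
    return gd_p(B, A, p)

def delta_p(A, B, p):
    """Averaged Hausdorff distance: the maximum of GD_p and IGD_p."""
    return max(gd_p(A, B, p), igd_p(A, B, p))

# Hypothetical sample sets in the plane.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 1.0)]
value = delta_p(A, B, 1)
```

Here GD_1(A,B) = (1 + √2)/2 and IGD_1(A,B) = 1, so Δ_1(A,B) = (1 + √2)/2; the asymmetry between the two one-sided quantities is the reason for taking the maximum.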

It has been established in [¹³] that Δp does not satisfy the triangle inequality but a weaker version given by:

$$\Delta_p(A,C) \le N^{1/p}\big(\Delta_p(A,B) + \Delta_p(B,C)\big),$$

where N≥1 is a constant with |A|,|B|,|C|≤N. Because of this, Δp is not a proper metric but a semimetric with relaxed triangle inequality. This notion has appeared with several conflicting names in the literature, among which inframetric is probably the one with less friction with pre-existing terminology.

There are two related conditions that relax the triangle inequality for a function d:X×X→[0,∞). Namely, the existence of a constant C>0 such that for any points a,b,c∈X one of the following properties holds:

1. The C-relaxed triangle inequality:

$$d(a,b) \le C\big(d(a,c) + d(c,b)\big).$$

2. The C-inframetric inequality:

$$d(a,b) \le C\max\{d(a,c),\, d(c,b)\}.$$

Since condition (2) implies (1) with the same constant C and, reciprocally, the C-relaxed triangle inequality implies the 2C-inframetric one, it is clear that both conditions are equivalent for an appropriate choice of constants. Hence, a semimetric satisfying any one of these conditions will be called an inframetric.

3 The (p, q)-Averaged Distance

In order to simplify the forthcoming calculations we use the following abbreviation:

|∑_{i=1}^{N} x_i := (1/N) ∑_{i=1}^{N} x_i = ℳ_1({x_1,…,x_N}),

to denote the average of x1,…,xN∈[0,∞].

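Definition 3.1. For $p,q\in\mathbb{R}\setminus\{0\}$, the generational (p,q)-distance $\mathrm{GD}_{p,q}(A,B)$ between two finite sets $A=\{a_i\}_{i=1}^{N_A}$ and $B=\{b_j\}_{j=1}^{N_B}$ in $\mathbb{R}^n$ is given by:

$$\mathrm{GD}_{p,q}(A,B) := \left(\frac{1}{N_A}\sum_{i=1}^{N_A}\left(\frac{1}{N_B}\sum_{j=1}^{N_B} d(a_i,b_j)^{q}\right)^{p/q}\right)^{1/p}.$$

When p<0 or q<0 it will always be assumed for consistency that A∩B=∅. The indicator GDp,q(A,B) can be extended to the values p=0 and/or q=0 by taking the limits p→0 and/or q→0, respectively. In such cases, the properties mentioned in the previous section suggest the following definitions:

$$\mathrm{GD}_{p,0}(A,B) := \left(\frac{1}{N_A}\sum_{i=1}^{N_A}\left(\prod_{j=1}^{N_B} d(a_i,b_j)\right)^{p/N_B}\right)^{1/p},$$

for p≠0,

$$\mathrm{GD}_{0,q}(A,B) := \left(\prod_{i=1}^{N_A}\left(\frac{1}{N_B}\sum_{j=1}^{N_B} d(a_i,b_j)^{q}\right)^{1/q}\right)^{1/N_A},$$

for q≠0, and

$$\mathrm{GD}_{0,0}(A,B) := \left(\prod_{i=1}^{N_A}\left(\prod_{j=1}^{N_B} d(a_i,b_j)\right)^{1/N_B}\right)^{1/N_A}.$$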

We can also calculate GDp,q when p→±∞ and/or q→±∞, by changing the respective sum for a minimum or a maximum. In particular, if A∩B=∅, we have the nice relation:

lim⁡q→−∞GDp,q(A,B)=GDp(A,B). (3.1)

Note that the definition of GDp,q has two drawbacks, namely GDp,q(A,B) does not necessarily vanish if A=B, and in general GDp,q(A,B)≠GDp,q(B,A), thus this indicator does not define a metric. In order to get a proper metric we introduce the following notion.

Definition 3.2. The (p,q)-averaged distance is the map Δp,q:𝒫0(ℝn)×𝒫0(ℝn)→ℝ given by:

Δp,q(A,B):=max⁡{GDp,q(A,B\A),GDp,q(B,A\B)}.

If A∩B=∅ then GDp(A,B)=GDp(A,B\A), therefore using (3.1) and Definition 3.2 we easily obtain:

lim⁡q→−∞Δp,q(A,B)=Δp(A,B). (3.2)
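Definitions 3.1 and 3.2 translate almost literally into code. The sketch below is a minimal illustration for finite, nonzero p and q and for the limit values ±∞ (the cases p=0 or q=0 are omitted); the names `power_mean`, `gd_pq`, `delta_pq`, the sample sets, and the convention of returning 0 when one of the sets is empty are assumptions made for this example.

```python
import math

def power_mean(values, q):
    """q-power mean for nonzero q, including q = +inf and q = -inf."""
    xs = list(values)
    if q == math.inf:
        return max(xs)
    if q == -math.inf:
        return min(xs)
    return (sum(x ** q for x in xs) / len(xs)) ** (1.0 / q)

def gd_pq(A, B, p, q):
    """GD_{p,q}(A, B): p-mean over a in A of the q-mean over b in B of d(a, b).

    For negative q the sets are expected to be disjoint, as in the text.
    """
    if not A or not B:
        return 0.0  # assumed convention, so that GD(A, B \\ A) = 0 when B is contained in A
    inner = [power_mean([math.dist(a, b) for b in B], q) for a in A]
    return power_mean(inner, p)

def delta_pq(A, B, p, q):
    """Delta_{p,q}(A, B) = max{GD_{p,q}(A, B \\ A), GD_{p,q}(B, A \\ B)}."""
    b_minus_a = [b for b in B if b not in A]
    a_minus_b = [a for a in A if a not in B]
    return max(gd_pq(A, b_minus_a, p, q), gd_pq(B, a_minus_b, p, q))

# Hypothetical disjoint sample sets in the plane.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 1.0)]
d_11 = delta_pq(A, B, 1, 1)
d_limit = delta_pq(A, B, 1, -math.inf)   # equals Delta_1(A, B) for disjoint sets, cf. (3.2)
```

Removing the common points before averaging is what makes Δp,q(A,A)=0 and keeps all distances strictly positive, so negative exponents never act on a zero distance.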

More generally, Δp,−∞(A,B)≥Δp(A,B) always holds. We point out that, similarly to the relation:

$$\mathrm{GD}_p(A,B) = N_A^{-1/p}\,\|D_{AB}\|_p,$$

between the generational distance GDp(A,B) and the matrix ℓp-norm of the distance matrix (D_AB)_ij:=d(a_i,b_j), we also have the following relation between the (p,q)-generational distance GDp,q(A,B) and the matrix ℓp,q-norm ∥D_AB∥p,q (defined precisely as GDp,q but with standard sums Σ instead of the normalized ones Σ|, see e.g. [⁹]):

$$\mathrm{GD}_{p,q}(A,B) = \mathcal{M}_p^{i\in I}\big(\mathcal{M}_q^{j\in J}(d(a_i,b_j))\big) = N_A^{-1/p}\,N_B^{-1/q}\,\|D_{AB}\|_{p,q}, \tag{3.3}$$

where I:={1,…,N_A} and J:={1,…,N_B}.

3.1 Metric Properties

Now we study the behavior of the distances GDp,q and Δp,q from a metric perspective. Since the elements of the distance matrix do not satisfy a relation of the kind DAC=DAB+DBC (and in general these matrices are of different sizes), their properties are not immediate from (3.3) and the triangle inequality for ℓp,q-norms. Here, we provide self-contained proofs for future reference and completeness.

Theorem 3.3. For parameters p,q∈[1,∞) the generational (p,q)-distance GDp,q satisfies the triangle inequality, namely:

$$\mathrm{GD}_{p,q}(A,C) \le \mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C),$$

for any finite sets A,B,C⊂ℝⁿ.

Proof. Let us assume that $A=\{a_i\}_{i=1}^{N_A}$, $B=\{b_j\}_{j=1}^{N_B}$, and $C=\{c_k\}_{k=1}^{N_C}$. From the triangle inequality for the Euclidean metric d(⋅,⋅) on ℝⁿ we know that for arbitrary values of the indices $i\in I:=\{1,\dots,N_A\}$, $j\in J:=\{1,\dots,N_B\}$, and $k\in K:=\{1,\dots,N_C\}$ it holds that:

$$d(a_i,c_k) \le d(a_i,b_j) + d(b_j,c_k).$$

Let us abbreviate these quantities by $\delta_{ik}:=d(a_i,c_k)$, $\delta_{ij}:=d(a_i,b_j)$, and $\delta_{jk}:=d(b_j,c_k)$, where the indices i,j,k will be understood to take values in the sets I, J, and K defined above, respectively. Then, for any of them:

$$\delta_{ik} \le \delta_{ij} + \delta_{jk}.$$

By (2.1), we can take the q-average over all k∈K on both sides of the previous inequality to obtain:

$$\mathcal{M}_q^{k\in K}(\delta_{ik}) \le \mathcal{M}_q^{k\in K}(\delta_{ij}+\delta_{jk}) \le \mathcal{M}_q^{k\in K}(\delta_{ij}) + \mathcal{M}_q^{k\in K}(\delta_{jk}),$$

where the last inequality follows from the Minkowski inequality (2.4) for q≥1. Since the averaged quantity in the first term on the right is independent of k, we have for all i∈I and j∈J that:

$$\mathcal{M}_q^{k\in K}(\delta_{ik}) \le \delta_{ij} + \mathcal{M}_q^{k\in K}(\delta_{jk}). \tag{3.4}$$

Here, we consider two cases for the parameters p,q∈[1,∞) independently.

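Case p≤q: Under this assumption, we take on both sides of (3.4) the p-average over all i∈I and, using (2.4) for p≥1 on the RHS (right-hand side), we get:

$$\mathcal{M}_p^{i\in I}\big(\mathcal{M}_q^{k\in K}(\delta_{ik})\big) \le \mathcal{M}_p^{i\in I}(\delta_{ij}) + \mathcal{M}_p^{i\in I}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big).$$

Since the p-averaged quantity in the last term on the RHS is independent of i∈I, this simplifies to:

$$\mathcal{M}_p^{i\in I}\big(\mathcal{M}_q^{k\in K}(\delta_{ik})\big) \le \mathcal{M}_p^{i\in I}(\delta_{ij}) + \mathcal{M}_q^{k\in K}(\delta_{jk}),$$

where the LHS (left-hand side) is precisely GDp,q(A,C). Now, we take on both sides the p-average over all j∈J, use that the LHS is independent of j, and employ (2.4) for p≥1 on the RHS, to find:

$$\mathrm{GD}_{p,q}(A,C) \le \mathcal{M}_p^{j\in J}\big(\mathcal{M}_p^{i\in I}(\delta_{ij})\big) + \mathcal{M}_p^{j\in J}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big) = \mathcal{M}_p^{i\in I}\big(\mathcal{M}_p^{j\in J}(\delta_{ij})\big) + \mathcal{M}_p^{j\in J}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big),$$

where the interchange in the last step is allowed by property (2.3). Now, property (2.2) for p≤q ensures that in the first term on the RHS we can replace the inner $\mathcal{M}_p$ by $\mathcal{M}_q$ to get an equal or larger quantity, therefore:

$$\mathrm{GD}_{p,q}(A,C) \le \mathcal{M}_p^{i\in I}\big(\mathcal{M}_q^{j\in J}(\delta_{ij})\big) + \mathcal{M}_p^{j\in J}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big) = \mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C),$$

which proves the claim.

Case q≤p: Let us take on both sides of (3.4) the q-average over all j∈J. Using that the LHS is independent of j, and (2.4) for q≥1 on the RHS, we obtain:

$$\mathcal{M}_q^{k\in K}(\delta_{ik}) \le \mathcal{M}_q^{j\in J}(\delta_{ij}) + \mathcal{M}_q^{j\in J}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big) \le \mathcal{M}_q^{j\in J}(\delta_{ij}) + \mathcal{M}_p^{j\in J}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big),$$

where the change of $\mathcal{M}_q$ to $\mathcal{M}_p$ in the last term of the RHS is justified by property (2.2) for q≤p.

Finally, we take on both sides the p-average over all i∈I, employ (2.4) for p≥1 on the RHS, and use that the last term is independent of i∈I, to write:

$$\mathrm{GD}_{p,q}(A,C) \le \mathcal{M}_p^{i\in I}\big(\mathcal{M}_q^{j\in J}(\delta_{ij})\big) + \mathcal{M}_p^{j\in J}\big(\mathcal{M}_q^{k\in K}(\delta_{jk})\big) = \mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C).$$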
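A quick numerical sanity check of Theorem 3.3 can be sketched as follows. This is not part of the proof, only an empirical test on random data; the helper names, the sampling box [0,10]², the set sizes, the number of trials, and the tolerance are all assumptions of this example.

```python
import math
import random

def power_mean(xs, q):
    """q-power mean for finite nonzero q."""
    return (sum(x ** q for x in xs) / len(xs)) ** (1.0 / q)

def gd_pq(A, B, p, q):
    """GD_{p,q}(A, B) for nonempty finite sets and finite nonzero p, q."""
    inner = [power_mean([math.dist(a, b) for b in B], q) for a in A]
    return power_mean(inner, p)

def random_set(rng, size):
    """A random finite set of points in [0, 10]^2."""
    return [(rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)) for _ in range(size)]

def count_violations(p, q, trials=500, seed=0):
    """Count random triples (A, B, C) violating GD(A,C) <= GD(A,B) + GD(B,C)."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        A, B, C = (random_set(rng, rng.randint(1, 4)) for _ in range(3))
        lhs = gd_pq(A, C, p, q)
        rhs = gd_pq(A, B, p, q) + gd_pq(B, C, p, q)
        if lhs > rhs + 1e-9:      # small tolerance for floating-point noise
            violations += 1
    return violations

# Parameters in [1, oo): the theorem predicts no violations at all.
results = {(p, q): count_violations(p, q) for p in (1, 2, 5) for q in (1, 2, 5)}
```

The sets are allowed to have different sizes on purpose, since the theorem makes no assumption on the cardinalities of A, B, and C.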

The corollary below states that the indicator Δp,q is a semimetric that becomes a metric if p,q≥1. When this is not the case but still |p|,|q|≥1, the theorem that follows ensures that Δp,q is at least an inframetric and provides the associated constants.

Corollary 3.4. For p,q∈ℝ¯ the (p,q)-averaged distance Δp,q is a semimetric on disjoint members of the family 𝒫0(ℝn) of finite subsets of ℝn. Moreover, for p,q∈[1,∞), the (p,q)-averaged distance is a proper metric on disjoint subsets A,B∈𝒫0(ℝn).

Proof. From Definition 3.2, it is easy to see that Δp,q(A,B)≥0 as well as:

Δp,q(A,B)=Δp,q(B,A),

for every A,B∈𝒫0(ℝn) and all p,q∈ℝ¯. Moreover, it is also clear from Definition 3.1 that GDp,q(A,B\A)=0 if and only if A=∅ or B⊆A, hence from Definition 3.2 we find, for A,B≠∅, that:

Δp,q(A,B)=0 if and only if A=B.

These properties show that Δp,q is a semimetric for any p,q∈ℝ¯.

Since the maximum of two functions satisfying the triangle inequality also satisfies it, Theorem 3.3 shows that Δp,q satisfies the triangle inequality for all p,q∈[1,∞) and the cases p or q equal to ∞ follow by taking the appropriate limits.

Theorem 3.5. For any p,q∈ℝ with |p|,|q|≥1 the generational (p,q)-distance GDp,q satisfies a relaxed triangle inequality. Explicitly:

$$\mathrm{GD}_{p,q}(A,C) \le N^{1/r}\big(\mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C)\big),$$

for all A,B,C∈𝒫0(ℝn), any constant N≥1 such that |A|,|B|,|C|≤N, and where r is given by:

$$\frac{1}{r} := \frac{1}{|p|} + \frac{1}{|q|}.$$

Proof. For arbitrary p≠0, let us assume that q<0, so that |q|=−q. We can write:

GDp,|q|(A,B)=(|∑i=1NA([|∑j=1NB[d(ai,bj)q]−1]−1)pq)1p,=(|∑i=1NA(NB−2harmj=1…NB{d(ai,bj)q})pq)1p,

which combined with the property (2.5) gives us:

GDp,|q|(A,B)=(|∑i=1NA([|∑j=1NB[d(ai,bj)q]−1]−1)pq)1p,=(|∑i=1NA(NB−2harmj=1…NB{d(ai,bj)q})pq)1p,

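An analogous inequality holds for arbitrary q≠0 if we assume that p<0. Therefore, we can write in general:

$$\mathrm{GD}_{|p|,|q|}(A,B) \le N^{1/r}\,\mathrm{GD}_{p,q}(A,B),$$

where $N_A,N_B\le N$ and:

$$r := \begin{cases} |\min\{p,q\}| & \text{if } pq<0,\\ \tfrac{1}{2}\operatorname{harm}\{|p|,|q|\} & \text{if } p<0,\ q<0. \end{cases}$$

Notice that in both cases r can be chosen to take the second value, i.e., $\frac{1}{r} := \frac{1}{|p|}+\frac{1}{|q|}$, when the sharpness of the constants does not matter. Finally, when |p|,|q|≥1 we employ the triangle inequality for $\mathrm{GD}_{|p|,|q|}$ to conclude that:

$$\mathrm{GD}_{p,q}(A,C) \le \mathrm{GD}_{|p|,|q|}(A,C) \le \mathrm{GD}_{|p|,|q|}(A,B) + \mathrm{GD}_{|p|,|q|}(B,C) \le N^{1/r}\big(\mathrm{GD}_{p,q}(A,B) + \mathrm{GD}_{p,q}(B,C)\big),$$

which finishes the proof.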

From Corollary 3.4 we know that Δp,q is a metric for p,q≥1. More generally, the following corollary states that Δp,q is an inframetric when |p|,|q|≥1.

Corollary 3.6. For any p,q∈ℝ with |p|,|q|≥1 the (p,q)-averaged distance Δp,q satisfies the following relaxed triangle inequality on disjoint subsets A,B,C∈𝒫0(ℝn):

$$\Delta_{p,q}(A,C) \le N^{1/r}\big(\Delta_{p,q}(A,B) + \Delta_{p,q}(B,C)\big),$$

for any constant N≥1 such that |A|,|B|,|C|≤N and where $\frac{1}{r} := \frac{1}{|p|}+\frac{1}{|q|}$.

Proof. The corollary follows immediately from Theorem 3.5 and Definition 3.2.

Theorem 3.7. Let A,B∈𝒫0(ℝn) and suppose that p≤p′, q≤q′, then:

Δp,q(A,B)≤Δp′,q(A,B), and Δp,q(A,B)≤Δp,q′(A,B).

Proof. It follows easily from Definition 3.2 and two applications of property (2.2).

Corollary 3.8. Let A,B∈𝒫0(ℝn) and suppose that p≤p′, then:

Δp(A,B)≤Δp′(A,B).

Proof. We obtain the corollary by using (3.2) and taking the limit q→−∞ in Theorem 3.7. For the convenience of the reader we present here an alternative, self-contained proof that is useful in the continuous case. Let X be a measure space with finite μ-measure. For any function f:X→ℂ in the Lebesgue space $L^r(X)$ with r≥1, a simple modification of the Hölder inequality tells us that:

$$\left(\int_X |f|\,d\mu\right)^{r} \le \mu(X)^{r-1}\int_X |f|^{r}\,d\mu. \tag{3.5}$$

For any p,p′∈ℝ with 1≤p≤p′ we have p′/p≥1, so with r=p′/p and $f=g^p$ in (3.5) we obtain:

$$\left(\int_X |g^{p}|\,d\mu\right)^{1/p} \le \mu(X)^{\frac{1}{p}-\frac{1}{p'}}\left(\int_X |g^{p'}|\,d\mu\right)^{1/p'}. \tag{3.6}$$

Taking as X the set $A=\{a_i\}_{i=1}^{N}$, as g:ℝⁿ→ℝ the function given by g(x):=d(x,B), and as μ the discrete measure $\mu_d$ on A:

$$\mu_d(x) := \begin{cases} 1 & \text{if } x\in A,\\ 0 & \text{if } x\notin A; \end{cases}$$

the inequality (3.6) becomes:

$$\left(\frac{1}{N}\sum_{i=1}^{N} d(a_i,B)^{p}\right)^{1/p} \le \left(\frac{1}{N}\sum_{i=1}^{N} d(a_i,B)^{p'}\right)^{1/p'}.$$

That is, GDp(A,B)≤GDp′(A,B), which easily implies Δp(A,B)≤Δp′(A,B).

3.2 Pareto-Compliance

A discussion of the Pareto-compliance of the averaged GDp, IGDp and Δp-indicators appeared in [¹³, Sec. III]. Similar observations can be made for the corresponding (p,q)-indicators introduced in this work. Here we will concentrate only on providing a complete proof of a result (analogous to Proposition 3 there) that describes the behavior of the GDp,q-indicator. The assumptions required are stronger than the compliance notion defined in Section 2.1, but they are useful to understand what will be needed in a very general situation.

If a MOP has an associated objective function f:X⊂ℝⁿ→ℝℓ whose objective space Y=f(X) has a Pareto front Y*, and A⊂Y denotes an approximating subset, the explicit GDp,q and Δp,q-performance indicators assigned to A are given, respectively, by:

$$\mathcal{I}^{\mathrm{GD}}_{p,q}(A) := \mathrm{GD}_{p,q}(A,Y^*), \quad\text{and}\quad \mathcal{I}^{\Delta}_{p,q}(A) := \Delta_{p,q}(A,Y^*).$$

For convenience, given x∈ℝℓ and q∈ℝ¯ we will abbreviate $\delta_q(x) := \mathcal{M}_q^{y\in Y^*}\big(d(x,y)\big)$.

Theorem 3.9. Let p,q∈ℝ and suppose that a pair of distinct finite subsets A,B⊂Y satisfy:

1. A≼B, i.e., ∀b∈B, ∃a∈A such that a≼b.

2. ∀a∈A, ∃b∈B such that a≼b and with δq(b)≤ℳp{δq(b′) | a≼b′, b′∈B}.

3. ∃a∈A\B, ∃b∈B\A such that a≺b.

4. ∀a∈A, ∀b∈B the following property holds: a≺b ⇒ δq(a)<δq(b).

Then $\mathcal{I}^{\mathrm{GD}}_{p,q}(A) < \mathcal{I}^{\mathrm{GD}}_{p,q}(B)$.

Proof. Let us assume that A={ai}i=1NA, where its elements are arranged in a nonincreasing order with respect to δq(ai), this means that:

i<j⇒δq(aj)≤δq(ai).

By conditions 1 and 2, we can decompose B into a partition B=B1∪⋯∪Bm of subsets defined recursively for i=1,…,m (with 1≤m≤NA), as:

Bi:={b∈B\(B0∪⋯∪Bi−1)|ai≼b},

with B0:=∅. Let Ni denote the size of Bi satisfying 1≤Ni≤NB=∑i=1mNi and arrange the elements of each subset Bi={bk(i)}k=1Ni in a nondecreasing order with respect to δq(bk(i)), i.e., for each fixed i:

k<k′⇒δq(bk(i))≤δq(bk′(i)).

Note that from the construction and condition 4, for i=1,…,m, we have that:

b∈Bi⇒δq(ai)≤δq(b).

In particular, for the first element b1(i) of each Bi, which minimizes the set {δq(b)|b∈Bi} we have δq(ai)≤δq(b1(i)). Moreover, condition 2 necessarily implies:

δq(b1(i))p≤|∑b∈B\Biδq(b)p. (3.7)

Due to the ordering of A and this observation,

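$$\mathcal{I}^{\mathrm{GD}}_{p,q}(A)^p = \frac{1}{N_A}\sum_{i=1}^{N_A}\delta_q(a_i)^p \le \frac{1}{m}\sum_{i=1}^{m}\delta_q(a_i)^p \le \frac{1}{m}\sum_{i=1}^{m}\delta_q\big(b_1^{(i)}\big)^p. \tag{3.8}$$

The inequality still holds if, for any given value of j=1,…,m, we replace on the RHS of (3.8) the element $b_1^{(j)}$ in the j-th term by any other $b_k^{(j)}\in B_j$ (with $k=2,\dots,N_j$) while keeping all the remaining terms fixed. Therefore, there are $N_j$ possible choices for this element and, in consequence, $N_j$ different inequalities for any given j=1,…,m:

$$\mathcal{I}^{\mathrm{GD}}_{p,q}(A)^p \le \frac{1}{m}\Bigg(\sum_{\substack{i=1 \\ i\ne j}}^{m}\delta_q\big(b_1^{(i)}\big)^p + \delta_q\big(b_{k_j}^{(j)}\big)^p\Bigg),$$

where $k_j=1,\dots,N_j$. When varying j=1,…,m, this procedure yields a total of $\sum_{j=1}^{m}N_j=N_B$ inequalities with the same LHS, and using (2.1) we can take the average of all of them. Since the LHS remains the same, we obtain:

$$\mathcal{I}^{\mathrm{GD}}_{p,q}(A)^p \le \frac{1}{N_B}\sum_{j=1}^{m}\sum_{k_j=1}^{N_j}\frac{1}{m}\Bigg(\sum_{\substack{i=1 \\ i\ne j}}^{m}\delta_q\big(b_1^{(i)}\big)^p + \delta_q\big(b_{k_j}^{(j)}\big)^p\Bigg) = \frac{1}{m}\sum_{j=1}^{m}\frac{1}{N_B}\Bigg(\sum_{\substack{i=1 \\ i\ne j}}^{m}N_j\,\delta_q\big(b_1^{(i)}\big)^p + \sum_{k_j=1}^{N_j}\delta_q\big(b_{k_j}^{(j)}\big)^p\Bigg).$$

Notice that conditions 3 and 4 imply that the previous inequality has to be strict, since the LHS contains all the elements of A and the RHS all the elements of B.

Rearranging and counting the terms, we get that:

$$\mathcal{I}^{\mathrm{GD}}_{p,q}(A)^p < \frac{1}{m}\sum_{j=1}^{m}\frac{1}{N_B}\Bigg(\sum_{\substack{i=1 \\ i\ne j}}^{m}N_j\,\delta_q\big(b_1^{(i)}\big)^p + \sum_{k_j=1}^{N_j}\delta_q\big(b_{k_j}^{(j)}\big)^p\Bigg) = \frac{1}{m}\sum_{j=1}^{m}\sum_{\substack{i=1 \\ i\ne j}}^{m}\frac{N_j}{N_B}\,\delta_q\big(b_1^{(i)}\big)^p + \frac{1}{m}\,\frac{1}{N_B}\sum_{j=1}^{m}\sum_{b\in B_j}\delta_q(b)^p,$$

which after a reordering in the first term becomes:

$$= \frac{1}{m}\sum_{i=1}^{m}\sum_{\substack{j=1 \\ j\ne i}}^{m}\frac{N_j}{N_B}\,\delta_q\big(b_1^{(i)}\big)^p + \frac{1}{m}\,\frac{1}{N_B}\sum_{b\in B}\delta_q(b)^p = \frac{1}{m}\sum_{i=1}^{m}\left(\frac{N_B-N_i}{N_B}\right)\delta_q\big(b_1^{(i)}\big)^p + \frac{1}{m}\,\frac{1}{N_B}\sum_{b\in B}\delta_q(b)^p.$$

Using (3.7) in each term of the first sum, and noting that every b∈B belongs to exactly m−1 of the sets B\B_i, we obtain the inequality:

$$\mathcal{I}^{\mathrm{GD}}_{p,q}(A)^p < \frac{1}{N_B}\sum_{b\in B}\delta_q(b)^p = \mathcal{I}^{\mathrm{GD}}_{p,q}(B)^p.$$

Thus, $\mathcal{I}^{\mathrm{GD}}_{p,q}(A) < \mathcal{I}^{\mathrm{GD}}_{p,q}(B)$ as expected.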

The behavior of $\mathcal{I}^{\Delta}_{p,q}$ is more delicate due to the definition of Δp,q, but for disjoint subsets not intersecting Y* similar arguments can be employed. We remark also that condition 3 is only needed to ensure a strict inequality in the conclusion. We would have obtained $\mathcal{I}^{\mathrm{GD}}_{p,q}(A)^p \le \mathcal{I}^{\mathrm{GD}}_{p,q}(B)^p$ by removing condition 3 and weakening 4 to the condition:

4’. ∀a∈A, ∀b∈B:a≼b⇒δq(a)≤δq(b).

4 Numerical Experiments

4.1 Working with Δp,q

In this section we consider a hypothetical Pareto front P given by the line segment from (0,1) to (1,0) in ℝ2, i.e. the set of all points:

(t,1−t)∈ℝ2 for 0≤t≤1.

This is the same example considered in [¹³, p. 506], and it enables us to make a comparison with values of Δp. In order to use the (p,q)-averaged distance, we discretize P into a set P′ by taking 11 uniformly distributed points over P. We assume two archives: X1 is obtained from P′ by replacing (0,1) with (0,10), which introduces an outlier, and adding 1/10 to the remaining ordinates; X2 is obtained from P′ by adding 5 to each ordinate. See Figure 1.

From [¹³], we know that:

Δ∞(A,B):=lim⁡p→∞Δp(A,B)

coincides with the standard Hausdorff distance dH.

In this case:

Δ1(P′,X1)=0.9091, Δ1(P′,X2)=4.5412, dH(P′,X1)=9, dH(P′,X2)=5;

according to Corollary 3.8 and [¹³, p. 512], these values must increase as p increases.
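The four values quoted above can be reproduced with a short script. The sketch below rebuilds P′, X1, and X2 as described in the text and evaluates Δ1 and the standard Hausdorff distance; the function and variable names are hypothetical, chosen for this illustration.

```python
import math

def gd_p(A, B, p):
    """GD_p(A, B): p-power mean over a in A of the distance from a to B."""
    return (sum(min(math.dist(a, b) for b in B) ** p for a in A) / len(A)) ** (1.0 / p)

def delta_p(A, B, p):
    """Averaged Hausdorff distance Delta_p(A, B)."""
    return max(gd_p(A, B, p), gd_p(B, A, p))

def hausdorff(A, B):
    """Standard Hausdorff distance between two finite sets."""
    a_to_b = max(min(math.dist(a, b) for b in B) for a in A)
    b_to_a = max(min(math.dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)

# P': 11 uniformly distributed points on the segment from (0,1) to (1,0).
P = [(k / 10, 1 - k / 10) for k in range(11)]
# X1: (0,1) replaced by the outlier (0,10); 1/10 added to the other ordinates.
X1 = [(0.0, 10.0)] + [(k / 10, 1 - k / 10 + 0.1) for k in range(1, 11)]
# X2: 5 added to every ordinate.
X2 = [(x, y + 5) for (x, y) in P]

d1_X1, d1_X2 = delta_p(P, X1, 1), delta_p(P, X2, 1)
dH_X1, dH_X2 = hausdorff(P, X1), hausdorff(P, X2)
```

For X1 the outlier contributes a distance of 9 and each of the other ten points a distance of 1/10, so Δ1(P′,X1) = (9 + 10·0.1)/11 = 10/11 ≈ 0.9091, whereas dH(P′,X1) = 9 is determined by the outlier alone.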

Tables 1 and 2 show that we can find values of p and q such that the (p,q)-averaged distance does not punish outliers heavily, for example p=q=1 or p=1 and q=−1. We remark that the values of Δp,q(P′,X1) do not change significantly under variations of q≤1 for a fixed p.

Table 1 Δp,q(P′,X1) for different values of p (columns) and q (rows)

q \ p	1	2	5	10	20
−∞	0.9091	2.7153	5.5714	7.0811	7.9831
−100	0.9272	2.7701	5.6839	7.2241	8.1443
−20	0.9537	2.8367	5.8202	7.3974	8.3396
−5	0.9895	2.8624	5.8705	7.4613	8.4117
−1	1.1131	2.8782	5.8848	7.4795	8.4322
1	1.3243	2.9112	5.8920	7.4886	8.4425
2	2.9277	2.9295	5.8956	7.4932	8.4476
5	5.820	5.8956	5.9063	7.5068	8.4630
10	7.4886	7.4932	7.5068	7.5292	8.4882

Table 2 Δp,q(P′,X2) for different values of p (columns) and q (rows)

q \ p	1	2	5	10	20
−∞	4.5412	4.5497	4.5751	4.6160	4.6867
−100	4.6442	4.6529	4.6790	4.7209	4.7933
−20	4.8425	4.8518	4.8795	4.9239	5.0003
−5	4.9624	4.9720	5.0007	5.0465	5.1250
−1	5.0008	5.0105	5.0394	5.0856	5.1646
1	5.0203	5.0301	5.0591	5.1055	5.1848
2	5.0301	5.0398	5.0690	5.1154	5.1949
5	5.0591	5.0690	5.0983	5.1450	5.2248
10	5.1055	5.1154	5.1450	5.1921	5.2725

Thus it is possible to work with q=1, in which case Δp,q is a metric according to Corollary 3.4, and still obtain values close to the ones given by the inframetric Δp, with the same p≥1.

For large values of p the behavior of Δp,q presents the same disadvantages as Δp or the standard Hausdorff distance. For example, in Table 1 it can be observed that all distances for p≥5 are useless because they imply that the distance from the discrete Pareto front P′ to the archive X1 is larger than its distance to the archive X2.

Figure 1 suggests that this is an undesirable outcome.

Tables 3 and 4 show that Δp,q is close to a metric when q≤−1 and p≥1. The percentage of triangle inequality violations decreases as p increases or as q increases towards −1. Comparing both tables we can also see that this percentage decreases as the size of the sets increases.

Table 3 Percentage of triangle inequality violations for different values of p (columns) and q (rows). Here we randomly chose 80 sets, each one containing 2 points in [0,10]², and verified the triangle inequality for all ordered triples of distinct sets (that is, 80·79·78 = 492960)

q \ p	1	2	5	10
−1	0.05396	0	0	0
−2	0.10265	0.00041	0	0
−5	0.28815	0.01217	0	0
−10	0.35622	0.05031	0.00041	0
−20	0.43046	0.08439	0.00446	0.00041

Table 4 The same as Table 3 but with sets containing 3 points in [0,10]²

q \ p	1	2	5	10
−1	0.00446	0	0	0
−2	0.02881	0	0	0
−5	0.15660	0.01379	0	0
−10	0.30388	0.05558	0.00609	0.00122
−20	0.40774	0.09250	0.01461	0.00609
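The experiment behind Tables 3 and 4 can be sketched as follows. This is a scaled-down illustration rather than the original setup: the number of sets (20 instead of 80), the random seed, the floating-point tolerance, and all function names are assumptions, so the percentages it produces are not expected to match the tables.

```python
import itertools
import math
import random

def power_mean(xs, q):
    """q-power mean for finite nonzero q."""
    return (sum(x ** q for x in xs) / len(xs)) ** (1.0 / q)

def gd_pq(A, B, p, q):
    inner = [power_mean([math.dist(a, b) for b in B], q) for a in A]
    return power_mean(inner, p)

def delta_pq(A, B, p, q):
    # Randomly sampled sets are assumed to be disjoint, so B \ A = B and A \ B = A.
    return max(gd_pq(A, B, p, q), gd_pq(B, A, p, q))

def violation_percentage(p, q, n_sets=20, set_size=2, seed=0):
    """Percentage of ordered triples of distinct sets violating the triangle inequality."""
    rng = random.Random(seed)
    sets = [[(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(set_size)]
            for _ in range(n_sets)]
    # Pairwise distance table, computed once.
    d = [[0.0 if i == j else delta_pq(sets[i], sets[j], p, q) for j in range(n_sets)]
         for i in range(n_sets)]
    triples = list(itertools.permutations(range(n_sets), 3))
    bad = sum(1 for i, j, k in triples if d[i][k] > d[i][j] + d[j][k] + 1e-9)
    return 100.0 * bad / len(triples)

metric_case = violation_percentage(1, 1)        # p, q >= 1: a metric, so no violations
inframetric_case = violation_percentage(1, -20) # may or may not show violations
```

Calling `violation_percentage(p, q, n_sets=80)` would examine the same number of triples as in the tables, at a correspondingly higher running time.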

4.2 Optimal Archives for Spherical Pareto Fronts

We now consider two standard Pareto sets: The convex and concave quarter-circle, see Figures 2, 3, 4, and 5.

$$P_1=\{(\cos\theta+1,\ \sin\theta+1) : -\pi\le\theta\le-\tfrac{\pi}{2}\}, \qquad P_2=\{(\cos\theta,\ \sin\theta) : 0\le\theta\le\tfrac{\pi}{2}\}. \tag{4.1}$$

Fig. 2 Optimal Δ1,−1 archives A for the connected Pareto front P1 given by (4.1) with 2, 3, and 10 elements (blue circles). Each figure includes the respective archive coordinates and the Δ1,−1 distance

Fig. 3 Optimal Δ1,−1 archives A for the connected Pareto front P1 given by (4.1) with 20, 30, and 40 elements (blue circles). Each figure includes the respective Δ1,−1 distance

Fig. 4 Optimal Δ1,−1 archives A for the connected Pareto front P2 given by (4.1) with 2, 3, and 10 elements (blue circles). Each figure includes the respective archive coordinates and the Δ1,−1 distance

Fig. 5 Optimal Δ1,−1 archives A for the connected Pareto front P2 given by (4.1) with 20, 30, and 40 elements (blue circles). Each figure includes the respective Δ1,−1 distance

To numerically find the optimal Δp,q archive of size M, we discretized the Pareto front with 1000 equidistant points (which is an acceptable discretization according to [¹², p. 603]) and randomly chose an initial M-sized archive. Then we used a random-walk evolutionary algorithm moving one point at a time. Finally, we refined the optimal archive with the “evenly spaced” construction suggested by [¹², p. 607].
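The text does not specify the details of the random-walk search, so the following Python sketch should be read as one possible interpretation under stated assumptions, not as the procedure actually used. It assumes a greedy acceptance rule, Gaussian moves with a fixed step size `sigma`, an initial archive drawn uniformly from the unit square, and a coarser discretization (200 points) to keep the example fast; the final “evenly spaced” refinement is not implemented, and all names are hypothetical.

```python
import math
import random

def power_mean(xs, q):
    return (sum(x ** q for x in xs) / len(xs)) ** (1.0 / q)

def gd_pq(A, B, p, q):
    if not A or not B:
        return 0.0  # assumed convention for an empty set
    return power_mean([power_mean([math.dist(a, b) for b in B], q) for a in A], p)

def delta_pq(A, B, p, q):
    b_minus_a = [b for b in B if b not in A]
    a_minus_b = [a for a in A if a not in B]
    return max(gd_pq(A, b_minus_a, p, q), gd_pq(B, a_minus_b, p, q))

def quarter_circle_front(n):
    """n points, equally spaced in the angle, on the front P1 of (4.1)."""
    angles = (-math.pi + 0.5 * math.pi * k / (n - 1) for k in range(n))
    return [(math.cos(t) + 1.0, math.sin(t) + 1.0) for t in angles]

def random_walk_archive(front, m, p, q, steps=300, sigma=0.05, seed=0):
    """Greedy random walk: perturb one archive point per step, keep improvements."""
    rng = random.Random(seed)
    archive = [(rng.random(), rng.random()) for _ in range(m)]
    start = best = delta_pq(archive, front, p, q)
    for _ in range(steps):
        i = rng.randrange(m)                      # move one point at a time
        x, y = archive[i]
        candidate = list(archive)
        candidate[i] = (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
        value = delta_pq(candidate, front, p, q)
        if value < best:                          # accept only strict improvements
            archive, best = candidate, value
    return archive, start, best

front = quarter_circle_front(200)
archive, start_value, final_value = random_walk_archive(front, m=3, p=1, q=-1)
```

Because only improving moves are accepted, the returned value never exceeds the starting one; such a greedy walk can stall in local minima, which is one reason a subsequent refinement step is useful.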

When finding optimal Δp,q archives, our numerical experiments suggest a clear geometrical influence of the parameters p and q. When p≥−1 increases the optimal archive moves away from the Pareto set (see Figure 8). For values of p in (−∞,−1) the optimal archive sets are basically the same. When q∈[−1,1] increases the optimal archive tends to lose dispersion, converging to one point. When q≥1 the optimal archive collapses to one point and when q∈(−∞,−1] the corresponding optimal archives are basically the same (see Figure 6).

Fig. 6 Optimal Δ1,q five point set archives A for the connected Pareto front P1 given by (4.1) with p=1 and different values of q

Fig. 7 Numerical optimal Δ1,−1 archive A for the disconnected step Pareto front P3(5) given by (4.2) with 5, 10, and 20 elements. Each figure includes the respective Δ1,−1 distance

Fig. 8 Optimal Δp,−1 one point archives A for the connected Pareto front P1 given by (4.1) with q=−1 and different values of p. In all cases, the archives are located in the line x=y

4.3 Optimal Archives for Disconnected Pareto Sets

In this section we present the optimal Δp,q archives for a disconnected step Pareto front:

$$P_3(s,\gamma) = \left\{\left(t,\ 1-\gamma t + (\gamma-1)\frac{\lfloor st\rfloor}{s}\right) : 0\le t\le 1\right\}, \tag{4.2}$$

where s is the number of steps, γ>0 is a small constant responsible for the step’s twist, and ⌊⋅⌋ stands for the integer part function.
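For concreteness, the parametrization (4.2) can be evaluated as in the sketch below. The function names and the choice of 1000 sample points are assumptions of this example; note also that, in floating-point arithmetic, the value of ⌊st⌋ exactly at a jump may fall on either side of it.

```python
import math

def step_front_point(t, s, gamma):
    """Point of P3(s, gamma) with parameter t in [0, 1], following (4.2)."""
    return (t, 1.0 - gamma * t + (gamma - 1.0) * math.floor(s * t) / s)

def step_front(s, gamma, n):
    """Discretization of P3(s, gamma) with n equally spaced parameter values."""
    return [step_front_point(k / (n - 1), s, gamma) for k in range(n)]

points = step_front(s=5, gamma=0.1, n=1000)
mid = step_front_point(0.5, 5, 0.1)   # lies on the third step
```

With s=5 and γ=1/10, each step descends gently with slope −γ and the ordinate drops by (1−γ)/s at every jump, so the front runs from (0,1) to (1,0).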

Figure 7 shows numerical optimal Δ1,−1 archives of sizes 5, 10, and 20, respectively. In each case (as in the previous section), the archive coordinates reveal that:

$$A\cap P_3\!\left(5,\tfrac{1}{10}\right)=\emptyset,$$

i.e., the optimal archive points do not lie over the Pareto front but they are so close to it that this is hardly noticeable. It is also evident that the archives are evenly distributed along the Pareto front.

5 Conclusions

1. The indicator Δp,q generalizes the well-known averaged Hausdorff distance Δp (see (1.1) and (3.2)), it is still related with the standard Hausdorff distance (see (1.2)), and admits an expression in terms of the matrix ℓp,q-norm ‖DAB‖p,q (see (3.3)).

2. For arbitrary values of p,q∈ℝ¯, the indicator Δp,q is an inframetric on the space of finite subsets of ℝn, and when p,q∈[1,∞) it is a proper metric (see Corollary 3.4). With a proper metric the principle “the distance between two objects is the length of the shortest path joining them” is satisfied, thus, working with a metric has the advantage of avoiding unpleasant geometrical phenomena like the one shown in Figure 1.

3. For p,q∈ℝ the GDp,q and Δp,q-performance indicators are compliant with an optimality associated to the dominance relations derived from the conditions in Theorem 3.9.

4. The parameters p and q play geometrical roles in the Δp,q-optimal archive finding process, i.e., when p increases the optimal archive moves away from the Pareto set (see Figure 8) and when q increases, the optimal archive loses dispersion (see Figure 6). Thus the (p,q)-averaged distance can be calibrated to fulfill a large variety of optimization objectives.

5. Comparing our solutions with the optimal Δ1 archives shown in [¹³], we conclude that they are very close and that the procedure to calculate Δp,q is no harder than the one used for Δp, either analytically or numerically.

6 Future Work

1. Suppose that p,q∈[1,∞) are fixed. Given a Pareto front and an arbitrary archive, it would be useful to establish a procedure to find a shortest or best “path” of configurations joining the given archive with the optimal one. In principle, this is possible when we are working with a proper metric.

2. Section 3 shows that the metricity of Δp,q is a consequence of the properties of power means for appropriate values of p and q. This indicates viable ways in which one would be able to modify or generalize this indicator preserving its behavior.

3. A deeper study of the Pareto-type compliance of the GDp,q and Δp,q-indicators is desirable to better assess their characteristics and possible drawbacks. This is also very important for applications.

4. The details of the extension of GDp,q and Δp,q to continuous sets and their properties are part of ongoing research and will appear in forthcoming publications (see [⁴]).

Acknowledgements

This work was partially supported by the project ID-PRY: 6736 of the Faculty of Sciences, Pontificia Universidad Javeriana, Bogotá, Colombia.

We would like to thank Prof. Oliver Schütze for introducing us to the subject of multiobjective optimization and for his warm friendship. We are also grateful to the referees for their positive review and for their useful comments and suggestions, which helped to improve the presentation and the results.

References

1. Abyar, E., & Ghaemi, M. (2015). Hausdorff measure of noncompactness of matrix operators on some sequence spaces of a double sequential band matrix. J. Inequal. Appl., pp. 2015:406, 19.

2. Aulbach, B., Rasmussen, M., & Siegmund, S. (2005). Approximation of attractors of nonautonomous dynamical systems. Discrete Contin. Dyn. Syst. Ser. B, Vol. 5, No. 2, pp. 215-238.

3. Barnsley, M. F. (1993). Fractals everywhere. Academic Press Professional, Boston, MA, second edition. Revised with the assistance of and with a foreword by Hawley Rising, III.

4. Bogoya, J. M., Vargas, A., Cuate, O., & Schütze, O. (2018). A (p,q)-averaged Hausdorff distance for arbitrary measurable sets. Submitted.

5. Bullen, P. S. (2003). Handbook of means and their inequalities, volume 560 of Mathematics and its Applications. Kluwer Academic Publishers Group, Dordrecht. Revised from the 1988 original [P. S. Bullen, D. S. Mitrinović and P. M. Vasić, Means and their inequalities, Reidel, Dordrecht; MR0947142].

6. Burago, D., Burago, Y., & Ivanov, S. (2001). A course in metric geometry, volume 33 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI.

7. Falconer, K. (2003). Fractal geometry. John Wiley & Sons, Inc., Hoboken, NJ, second edition. Mathematical foundations and applications.

8. Fraigniaud, P., Lebhar, E., & Viennot, L. (2008). The inframetric model for the internet. IEEE INFOCOM 2008 - The 27th Conference on Computer Communications, pp. 1085-1093.

9. Goldberg, M. (1987). Equivalence constants for l_p norms of matrices. Linear and Multilinear Algebra, Vol. 21, No. 2, pp. 173-179.

10. Hansen, M., & Jaszkiewicz, A. (1998). Evaluating the quality of approximations to the non-dominated set. IMM, Department of Mathematical Modelling, Technical University of Denmark.

11. Pareto, V. (1971). Manual of Political Economy. The Macmillan Press, London.

12. Rudolph, G., Schütze, O., Grimme, C., Domínguez-Medina, C., & Trautmann, H. (2016). Optimal averaged Hausdorff archives for bi-objective problems: theoretical and numerical results. Comput. Optim. Appl., Vol. 64, pp. 589-618.

13. Schütze, O., Esquivel, X., Lara, A., & Coello, C. (2012). Using the averaged Hausdorff distance as a performance measure in evolutionary multiobjective optimization. IEEE Trans. Evol. Comput., Vol. 16, No. 4, pp. 504-522.

14. Siwel, J., Yew-Soon, O., Jie, Z., & Liang, F. (2014). Consistencies and contradictions of performance metrics in multiobjective optimization. IEEE Trans. Evol. Comput., Vol. 44, No. 12, pp. 2329-2404.

15. Taylor, S. (1964). The exact Hausdorff measure of the sample path for planar Brownian motion. Proc. Cambridge Philos. Soc., Vol. 60, pp. 253-258.

16. Zhang, Q., & Li, H. (2007). MOEA/D: A multiobjective evolutionary algorithm based on decomposition. IEEE Trans. Evol. Comput., Vol. 11, No. 6, pp. 712-731.

17. Zitzler, E., & Thiele, L. (1999). Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput., Vol. 3, No. 4, pp. 257-271.

18. Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C. M., & Da Fonseca, V. G. (2003). Performance assessment of multiobjective optimizers: An analysis and review. IEEE Trans. Evol. Comput., Vol. 7, No. 2, pp. 117-132.

Received: February 01, 2017; Accepted: August 28, 2017

Corresponding author is Andrés Vargas. a.vargasd@javeriana.edu.co, jbogoya@javeriana.edu.co.

This is an open-access article distributed under the terms of the Creative Commons Attribution License