1 Introduction
Computer vision is a form of information processing that allows a machine to understand its environment by analyzing and interpreting visual information.
It is a branch of artificial intelligence whose goal is to allow a machine to understand what it “sees” when it is connected to one or more cameras. Stereo matching is one of the most difficult problems in computer vision. It is a fundamental low-level vision task for many applications such as autonomous vehicle navigation, object manipulation, robot vision and biometrics [1, 11, 6]. Indeed, stereo matching is the fundamental precursor of 3D reconstruction from a stereo pair or a sequence of images.
It consists of finding the points in two or more images that correspond to the same physical entity in the scene. The result of image matching is the disparity map; this map contains the spatial shift of each pixel but, on its own, it does not give the 3D structure of the observed scene [7].
Local methods are based on an analysis of the neighborhood of the points to be matched. Several methods have been developed in this context; the most widely used are SAD, SSD and ZNCC. These methods are very effective for 3D reconstruction from images with little motion and a small variation of the camera’s intrinsic parameters.
Their advantage is that they require few resources and produce dense disparity maps. However, they have a high error rate, especially in occluded regions and in weakly textured regions.
Other approaches to matching are grouped together as global methods. This type of method always leads to the minimization of an energy function defined over the entire image.
Matching is thus cast as a combinatorial optimization problem. In this paper, we evaluate the performance of several stereo matching algorithms: correlation techniques, subpixel estimation, dynamic programming and hierarchical matching.
The experiments are conducted on an image pair from the Middlebury database with different noise levels. As performance measures, we consider the mean absolute error, the mean relative error, the percentages of correct and wrong matches, and the computational time.
Finally, we assess the impact of noise on these algorithms, as well as the impact of algorithm parameters such as the correlation window size and the correlation function. The rest of the paper is organized as follows.
In the next section, an overview of the different stereo matching algorithms is given. Section 3 presents the evaluation criteria. Section 4 discusses the results obtained by the different approaches, and we conclude with some perspectives.
2 Matching Algorithms
In general, matching consists of finding homologous primitives in the left and right images, that is to say, primitives that are projections of the same entity of the scene. There are two types of matching:
– Dense matching: the correspondent is computed for each point of an image.
– Sparse matching: the correspondent is computed only for points with particular properties (contour points, corners, etc.).
The results of the matching are visually represented by an image called the disparity map.
Each pixel of the disparity map represents the distance between the position of a pixel in the left image and that of its correspondent in the right image. In this study, we consider the following stereo matching algorithms:
2.1 Dynamic Programming
Introduced by Bellman and Dreyfus, dynamic programming plays an important role in optimization problems. In computer vision, the Viterbi algorithm is used to estimate the disparity map between two images. The purpose of this technique is to find the optimal path from one side of the image to the other, using the block matching metric as the cost function [13, 8]. Many researchers have worked on improving the dynamic programming algorithm, proposing an iterated algorithm [4] as well as parallel algorithms [9, 2].
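The optimal-path idea can be illustrated with a minimal sketch on a single scanline. It is not the implementation evaluated in this paper: the function name `scanline_viterbi`, the per-pixel absolute difference used in place of a block matching metric, and the linear smoothness penalty `smooth` are all assumptions made for illustration.

```python
def scanline_viterbi(left_row, right_row, max_disp, smooth=1.0):
    """Viterbi-style disparity estimate for one scanline (illustrative sketch).

    States are the candidate disparities 0..max_disp. The path cost is the
    sum of a data term |L[x] - R[x - d]| and an assumed smoothness term
    smooth * |d - d_prev| between neighbouring pixels.
    """
    n = len(left_row)
    inf = float("inf")
    states = range(max_disp + 1)

    def data_cost(x, d):
        # Left pixel x is compared with right pixel x - d; out-of-image
        # candidates are forbidden by an infinite cost.
        if x - d < 0:
            return inf
        return abs(left_row[x] - right_row[x - d])

    cost = [[inf] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    for d in states:
        cost[0][d] = data_cost(0, d)

    # Forward pass: best cumulative cost of reaching each (pixel, disparity).
    for x in range(1, n):
        for d in states:
            prev = min(states, key=lambda p: cost[x - 1][p] + smooth * abs(d - p))
            back[x][d] = prev
            cost[x][d] = data_cost(x, d) + cost[x - 1][prev] + smooth * abs(d - prev)

    # Backward pass: follow the stored pointers from the cheapest end state.
    d = min(states, key=lambda s: cost[n - 1][s])
    disparities = [0] * n
    for x in range(n - 1, -1, -1):
        disparities[x] = d
        d = back[x][d]
    return disparities


right = [10, 20, 30, 40, 50, 60, 70, 80]
left = [10, 10] + right[:-2]  # left row is the right row shifted by 2 pixels
print(scanline_viterbi(left, right, max_disp=3))
```

The cost of this sketch grows with the square of the number of candidate disparities per pixel, which is one reason iterated and parallel variants are of interest.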
2.2 Hierarchical Matching
Hierarchical matching is one of the most widely used algorithms in stereo vision [14]. Its basic idea is the construction of a pyramid of images.
We therefore reduce the pair of images: the top of the pyramid corresponds to the lowest resolution and the base of the pyramid corresponds to the original image.
For example, if the original image is 2048×2048 pixels and we reduce the resolution down to 16×16, the pyramid contains eight levels. To move from one level to the next, we reduce the image by averaging the pixel values in the square
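The pyramid construction can be sketched as follows. The sketch assumes that each level halves the resolution by averaging non-overlapping 2×2 blocks, which is consistent with the eight levels between 2048×2048 and 16×16 but is not stated explicitly in the text; the helper names are hypothetical.

```python
def downsample_2x2(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks (assumed)."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def build_pyramid(image, min_size):
    """Return the levels from the original image (base) to the coarsest (top)."""
    levels = [image]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample_2x2(levels[-1]))
    return levels


def count_levels(size, min_size):
    """Number of levels obtained by halving `size` until `min_size` is reached."""
    n = 1
    while size // 2 >= min_size:
        size //= 2
        n += 1
    return n


print(count_levels(2048, 16))  # 2048, 1024, 512, 256, 128, 64, 32, 16 -> 8
```

In a coarse-to-fine scheme, the disparity found at a coarse level is typically doubled and used to restrict the search range at the next finer level.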
2.3 Correlation Methods
The correlation techniques measure the resemblance between two images. We consider a correlation window around the pixel to be matched. This window is compared with a window of the same size in the right image, which is moved along the epipolar line. For each displacement of the window in the right image, a correlation index is calculated. The displacement that gives the best correlation index (the minimum for a distance measure, the maximum for a cross-correlation measure) is retained as the disparity. The correlation methods can be classified into two categories: cross-correlation measures and distance measures. The values obtained by the NCC are in
ZNCC is widely used in the literature and corresponds to the classical linear correlation in statistics [5]. The distance measures are also widely used. The disparity values are in
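The window search along the epipolar line can be sketched as below, assuming a rectified pair so that the epipolar line is the image row. The helpers `window`, `sad`, `zncc` and `best_disparity` are hypothetical names, and the formulas are the standard textbook forms rather than a transcription of Table 1.

```python
import math


def window(image, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) window centred on (row, col)."""
    return [
        image[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
    ]


def sad(a, b):
    """Sum of absolute differences: a distance, lower is better."""
    return sum(abs(x - y) for x, y in zip(a, b))


def zncc(a, b):
    """Zero-mean normalized cross-correlation: higher is better."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den > 0 else 0.0


def best_disparity(left, right, row, col, half, max_disp, score=sad, maximize=False):
    """Slide the window along the row of the right image and keep the best shift."""
    ref = window(left, row, col, half)
    best_d, best_s = 0, None
    for d in range(max_disp + 1):
        if col - d - half < 0:  # window would leave the image
            break
        s = score(ref, window(right, row, col - d, half))
        if best_s is None or (s > best_s if maximize else s < best_s):
            best_d, best_s = d, s
    return best_d


base = [3, 8, 1, 9, 4, 7, 2, 6, 5, 0]
right = [base[:] for _ in range(3)]
left = [[0, 0] + base[:-2] for _ in range(3)]  # left is right shifted by 2 pixels
print(best_disparity(left, right, row=1, col=5, half=1, max_disp=3))
print(best_disparity(left, right, 1, 5, 1, 3, score=zncc, maximize=True))
```

The `maximize` flag reflects the difference between the two families: distance measures are minimized, cross-correlation measures are maximized.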
3 Evaluation Criteria
In this study, we propose to use four performance measures to evaluate the disparity maps provided by the algorithms. The first one is the mean absolute error, which is the mean of the absolute error between the theoretical (ground-truth) disparity map and the computed disparity map. The mean absolute error is defined as follows:
where
Moreover, we calculate the percentage of correct matches:
And the percentage of wrong matches:
where
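As an illustration, the measures can be computed as in the following sketch. The mean absolute error follows the verbal definition given above; the rule used to decide that a match is correct (absolute error not larger than a tolerance `tol`, here 1 pixel) is an assumption, as are the function and parameter names. The mean relative error is left out because its exact definition is not recoverable from the text.

```python
def evaluate(computed, truth, tol=1.0):
    """Return (MAE, % correct matches, % wrong matches) for flattened maps.

    `tol` is an assumed threshold: a pixel counts as a correct match when
    its absolute disparity error does not exceed `tol`.
    """
    n = len(computed)
    abs_err = [abs(c - t) for c, t in zip(computed, truth)]
    mae = sum(abs_err) / n
    correct = sum(1 for e in abs_err if e <= tol)
    pcm = 100.0 * correct / n
    pwm = 100.0 - pcm  # every pixel is either a correct or a wrong match
    return mae, pcm, pwm


print(evaluate([10, 12, 15, 20], [10, 11, 18, 20]))
```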
4 Experimental Results
In this section, the performance evaluation of each stereo matching algorithm is presented. The evaluation is carried out in terms of:
– Mean absolute error,
– Mean relative error,
– Percentage of correct matches,
– Percentage of wrong matches,
– Computational time.
We also consider another important evaluation criterion: the visual quality of the disparity map. The disparity estimation algorithms are implemented in Matlab on a PC with an Intel i3 2.13 GHz processor and 4 GB of RAM, and we use the Computer Vision Toolbox for the 3D reconstruction.
The experiments are conducted on the Middlebury image pair called “teddy” with a size of
We also use the SAD as the matching cost for dynamic programming and hierarchical matching. Figure 1 shows the left and right images with the theoretical disparity map. The aim of this study is to evaluate the performance of the matching algorithms and to show their efficiency at different noise levels.
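The noise levels can be simulated as in the sketch below, which adds zero-mean Gaussian noise of a given variance to an image. It assumes intensities normalized to [0, 1] and clips the result to that range; the function name, the seed parameter and the variance used in the example are hypothetical and are not the settings used in the experiments.

```python
import math
import random


def add_gaussian_noise(image, variance, seed=0):
    """Add zero-mean Gaussian noise to an image with intensities in [0, 1]."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    sigma = math.sqrt(variance)
    return [
        [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in image
    ]


image = [[0.2, 0.5], [0.8, 1.0]]
noisy = add_gaussian_noise(image, variance=0.01)
```

Applying the same function to the left and right images with different seeds gives independent noise realizations for the two views.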

Fig. 1 The left image (a), the right image (b) and the reference disparity map (c)
We use Gaussian noise with different variances:

Fig. 2 The figure shows the left image (a), (b) and (c) with

Fig. 3 The visual result of stereo matching. Disparity maps computed with different matching algorithms
Table 1 Some correlation methods
Methods | Formulation |
Cross Correlation | |
NCC | |
ZNCC | |
Distance and Locally Centered Distance | |
SSD | |
SAD | |
ZSSD | |
ZSAD | |
LSSD | |
LSAD |
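For reference, the standard forms of several of these measures, as commonly given in the literature, are sketched below. The notation is assumed here and may differ from the paper's own: $I_l$ and $I_r$ are the left and right images, $W$ is the correlation window centred on the pixel $(x, y)$, $d$ is the candidate disparity, and $\bar{I}_l$, $\bar{I}_r$ are the window means. Sums run over $(i, j) \in W$, with $I_l$ evaluated at $(x+i, y+j)$ and $I_r$ at $(x+i-d, y+j)$.

```latex
\begin{align*}
\mathrm{SAD}  &= \sum_{(i,j)\in W} \left| I_l - I_r \right| \\
\mathrm{SSD}  &= \sum_{(i,j)\in W} \left( I_l - I_r \right)^2 \\
\mathrm{ZSAD} &= \sum_{(i,j)\in W} \left| (I_l - \bar{I}_l) - (I_r - \bar{I}_r) \right| \\
\mathrm{ZSSD} &= \sum_{(i,j)\in W} \left( (I_l - \bar{I}_l) - (I_r - \bar{I}_r) \right)^2 \\
\mathrm{NCC}  &= \frac{\sum_{(i,j)\in W} I_l \, I_r}
                      {\sqrt{\sum_{(i,j)\in W} I_l^2 \; \sum_{(i,j)\in W} I_r^2}} \\
\mathrm{ZNCC} &= \frac{\sum_{(i,j)\in W} (I_l - \bar{I}_l)(I_r - \bar{I}_r)}
                      {\sqrt{\sum_{(i,j)\in W} (I_l - \bar{I}_l)^2 \; \sum_{(i,j)\in W} (I_r - \bar{I}_r)^2}}
\end{align*}
```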
Table 2 The results obtained by the different methods: mean absolute error (MAE), mean relative error (MRE), percentage of correct matches (PCM, %) and percentage of wrong matches (PWM, %). D. P. is the Dynamic Programming and H. M. is the Hierarchical Matching
Methods | MAE | MRE | PCM | PWM |
SAD | 1.79 | 0.015 | 96.69 | 3.31 |
ZSAD | 0.95 | 0.010 | 98.28 | 1.72 |
LSAD | 1.03 | 0.011 | 98.11 | 1.89 |
SSD | 1.58 | 0.013 | 97.06 | 2.94 |
ZSSD | 1.16 | 0.011 | 98.06 | 1.94 |
LSSD | 1.26 | 0.012 | 97.85 | 2.15 |
NCC | 1.24 | 0.012 | 97.87 | 2.13 |
ZNCC | 0.91 | 0.009 | 98.30 | 1.70 |
SubPixel | 3.69 | 0.031 | 93.56 | 6.44 |
D. P. | 2.61 | 0.021 | 95.56 | 4.44 |
H. M. | 8.67 | 0.047 | 83.82 | 16.16 |
We note that the ZNCC technique provides good results on noise-free images, but it performs less well on noisy images. A high percentage of correct matches is also recorded for ZSAD and ZSSD, with an advantage for ZSAD
There is no value for hierarchical matching, ZNCC and dynamic programming for
The same observation holds in Figures 6 and 7. The percentage of correct matches is clearly highest at every noise level when the dynamic programming algorithm is used.
Generally speaking, the dynamic programming algorithm provides very good results on noisy images. It is also of interest to evaluate the efficiency of these matching algorithms in terms of computational time.
Figure 8 shows the computational time of each method versus the percentage of noise. The lowest computational times are recorded for subpixel estimation and hierarchical matching. The slowest methods are ZNCC and NCC. To show the efficiency of the algorithms, we plot the percentage of correct matches versus the computational time.
The most efficient algorithm is the one with a small computational time and a high percentage of correct matches. In other words, the algorithm that appears closest to the top left of the figure is the best one.
Finally, ZNCC gives better results on noise-free images, whereas dynamic programming gives better results on noisy images.
This is why we choose the dynamic programming algorithm for
We consider the “hallway” images of the Matlab Computer Vision Toolbox, with a known intrinsic matrix:
Focal_length_x | = | 409.4433 |
Focal_length_y | = | 416.0865 |
Camera_center_x | = | 204.1225 |
Camera_center_y | = | 146.4133 |
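These four values define the intrinsic matrix under the usual pinhole model. The sketch below shows how they could be used to turn a disparity into a 3D point, assuming zero skew and a rectified pair. The baseline is not given in the text, so `baseline` is a hypothetical parameter, and the helper names are illustrative.

```python
FX, FY = 409.4433, 416.0865  # focal lengths in pixels
CX, CY = 204.1225, 146.4133  # principal point in pixels

# Intrinsic matrix K, assuming zero skew.
K = [
    [FX, 0.0, CX],
    [0.0, FY, CY],
    [0.0, 0.0, 1.0],
]


def depth_from_disparity(disparity, baseline, fx=FX):
    """Depth Z = fx * B / d for a rectified pair (baseline B is hypothetical)."""
    return fx * baseline / disparity


def back_project(u, v, depth):
    """3D point (X, Y, Z) in the camera frame for pixel (u, v) at a given depth."""
    return ((u - CX) * depth / FX, (v - CY) * depth / FY, depth)


# A pixel at the principal point lies on the optical axis.
print(back_project(CX, CY, 2.0))
```

The depth is expressed in the same unit as the baseline, and pixels with zero disparity must be excluded before calling `depth_from_disparity`.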
The following figures show the left image and the right image used in the 3D reconstruction.
5 Conclusion
In this paper, we analyzed and evaluated the performance of several stereo matching algorithms. We proposed a comparison protocol that assesses their efficiency and performance in terms of absolute and relative errors and of matching accuracy, measured by the percentages of correct and wrong matches.
We also considered the computational time and the robustness of each algorithm to noise. The experiments were conducted on the teddy images from the Middlebury data set. We applied different levels of Gaussian noise to the left and right images and computed the disparity maps. Finally, we proposed a 3D reconstruction using the best disparity map obtained from the methods.
The results show that ZNCC gives very satisfactory results on noise-free images. Dynamic programming gives good results and is very robust to noise. In terms of computational time, the lowest times were recorded for subpixel estimation and hierarchical matching.