Published by
Nineye under
algorithm on
February 24, 2009
Hausdorff distance 알고리즘은 frechet distance 알고리즘의 이전에 pattern matching 알고리즘에 많이 이용된 알고리즘이다. 하지만 matching의 정확성에 대한 문제로 여러 가지 논란이 있었고, 현재는 일반적으로 Frechet distance 알고리즘이 Hausdorff distance 알고리즘의 대안이라고 많이 소개가 되고 있다.
Hausdorff distance 알고리즘은 기본적으로 edge가 아닌 vertex에 중점을 두고 있기 때문에 geometric pattern matching에서는 고려가 되고 있지 않은 알고리즘이지만, 성능이 빠르다는 큰 장점을 가지고 있기 때문에 아직도 많이 쓰이는 분야가 있다.
결국 수많은 Pattern matching 알고리즘의 특성들을 파악하여 두 알고리즘 중, 어떤 것을 선택하는지 결정해야 하고, 필자는 지도와 경로와의 관계에 대한 연구 개발을 했기 때문에 Frechet Distance 알고리즘을 사용하였다.
원문 : http://www-cgrl.cs.mcgill.ca/%7Egodfried/teaching/cg-projects/98/normand/main.html
Web project presented to Mr. Godfried Toussaint
for the course CS 507 Computational Geometry
McGill University, fall 1998
When talking about distances, we usually mean the shortest : for instance, if a point X is said to be at distance D of a polygon P, we generally assume that D is the distance from X to the nearest point of P. The same logic applies for polygons : if two polygons A and B are at some distance from each other, we commonly understand that distance as the shortest one between any point of A and any point of B. Formally, this is called a minimin function, because the distance D between A and B is given by :
(eq. 1)
This equation reads like a computer program : « for every point a of A, find its smallest distance to any point b of B ; finally, keep the smallest distance found among all points a ».
That definition of distance between polygons can become quite unsatisfactory for some applications ; let’s see for example fig. 1. We could say the triangles are close to each other considering their shortest distance, shown by their red vertices. However, we would naturally expect that a small distance between these polygons means that no point of one polygon is far from the other polygon. In this sense, the two polygons shown in fig. 1 are not so close, as their furthest points, shown in blue, could actually be very far away from the other polygon. Clearly, the shortest distance is totally independent of each polygonal shape.
| Figure 1 : |
The shortest distance doesn’t consider the whole shape. |
Another example is given by fig. 2, where we have the same two triangles at the same shortest distance than in fig. 1, but in different position. It’s quite obvious that the shortest distance concept carries very low informative content, as the distance value did not change from the previous case, while something did change with the objects.
| Figure 2 : |
The shortest distance doesn’t account for the position of the objects. |
As we’ll see in the next section, in spite of its apparent complexity, the Hausdorff distance does capture these subtleties, ignored by the shortest distance.
Named after Felix Hausdorff (1868-1942), Hausdorff distance is the « maximum distance of a set to the nearest point in the other set » [Rote91]. More formally, Hausdorff distance from set A to set B is a maximin function, defined as
(eq. 2)
where a and b are points of sets A and B respectively, and d(a, b) is any metric between these points ; for simplicity, we’ll take d(a, b) as the Euclidian distance between a and b. If for instance A and B are two sets of points, a brute force algorithm would be :
Brute force algorithm : 1. h = 0
2. for every point ai of A,
2.1 shortest = Inf ;
2.2 for every point bj of B
dij = d (ai , bj )
if dij < shortest then
shortest = dij
2.3 if shortest > h then
h = shortest

|
|
Figure 3 : Hausdorff distance on point sets.
This is illustrated in fig. 3 : just click on the arrow to see the basic steps of this computation. This algorithm obviously runs in O(n m) time, with n and m the number of points in each set.
It should be noted that Hausdorff distance is oriented (we could say asymmetric as well), which means that most of times h(A, B) is not equal to h(B, A). This general condition also holds for the example of fig. 3, as h(A, B) = d(a1, b1), while h(B, A) = d(b2, a1). This asymmetry is a property of maximin functions, while minimin functions are symmetric.
A more general definition of Hausdorff distance would be :
H (A, B) = max { h (A, B), h (B, A) } (eq.3)
which defines the Hausdorff distance between A and B, while eq. 2 applied to Hausdorff distance from A to B (also called directed Hausdorff distance). The two distances h(A, B) and h(B, A) are sometimes termed as forward and backward Hausdorff distances of A to B. Although the terminology is not stable yet among authors, eq. 3 is usually meant when talking about Hausdorff distance. Unless otherwise mentionned, from now on we will also refer to eq. 3 when saying “Hausdorff distance”.
If sets A and B are made of lines or polygons instead of single points, then H(A, B) applies to all defining points of these lines or polygons, and not only to their vertices. The brute force algorithm could no longer be used for computing Hausdorff distance between such sets, as they involve an infinite number of points.

So, what about the polygons of fig. 1 ? Remember, some of their points were close, but not all of them. Hausdorff distance gives an interesting measure of their mutual proximity, by indicating the maximal distance between any point of one polygon to the other polygon. Better than the shortest distance, which applied only to one point of each polygon, irrespective of all other points of the polygons.
| Figure 4 : |
Hausdorff distance shown around extremum of each triangles of fig. 1. Each circle has a radius of H( P1 , P2). |
The other concern was the insensitivity of the shortest distance to the position of the polygons. We saw that this distance doesn’t consider at all the disposition of the polygons. Here again, Hausdorff distance has the advantage of being sensitive to position, as shown in fig.5.
| Figure 5 : |
Hausdorff distance for the triangles of fig. 4 at the same shortest distance, but in different position. |
3.1 Assumptions
Throughout the rest of our discussion, we assume the following facts about polygons A and B :
- Polygons A and B are simple convex polygons ;
- Polygons A and B are disjoint from each other, that is :- they don’t intersect together ;
- no polygon contains the other.
3.2 Lemmas
The algorithm explained in the next section is based on three geometric observations, presented here. In order to simplify the text, we assume two points a and b that belong respectively to polygons A and B, such that :
d (a, b) = h (A, B)
In simple words, a is the furthest point of polygon A relative to polygon B, while b is the closest point of polygon B relative to polygon A.
| Lemma 1b : |
The perpendicular to ab at b is a supporting line of B, and a and B are on different sides relative to that line.
Proof of lemma 1b |
| Lemma 2 : |
There is a vertex x of A such that the distance from x to B is equal to h (A, B).
Proof of lemma 2 |
| Lemma 3 : |
Let bi be the closest point of B from a vertex a i of A. If µ is the moving direction (clockwise or counterclockwise) from bi to bi+1 then, for a complete cycle through all vertices of A, µ changes no more than twice.
Proof of lemma 3 |
3.3 Algorithm
The algorithm presented here was proposed by [Atallah83]. Its basic strategy is to compute successively h(A,B) and h(B, A) ; because of lemma 2, there is no need to query every point of the starting polygon, but only its vertices.

|
An important fact used by this algorithm is that a closest point can only be a vertex of the target polygon, or the foot z of a line perpendicular to one of its edges.
This fact suggests a function to check for the existence of a possible closest point. Given a source point a and a target edge defined by a point b1 and a vertex b2 : |
|
Function z = CheckForClosePoint (a, b1 , b2 ) :
Compute the position z where the line that passes through b1 and b2 crosses its perpendicular through a ;
if z is between b1 b2 then return z ;
else compute at b2 a line P perpendicular to the line ab2 ;
if P is a supporting line of B then return b2 ;
else return NULL.
That function obviously uses lemma 1b to decide whether or not the closest point of B might be located on the target edge, that should be close to a. It also supposes that the source point a and b2 are not located on different sides of the perpendicular to [b1b2 ] at b1, accordingly to lemma 3.
Now we are ready for the main algorithm ; the vertices of both polygons are presumed to be enumerated counterclockwise :
Algorithm for computing h(A, B) : 1. From a1, find the closest point b1 and compute d1 = d ( a1, b1 )
2. h(A, B) = d1
3. for each vertex ai of A,
3.1 if ai+1 is to the left of aibi
find bi+1 , scanning B counterclockwise with CheckForClosePoint from bi
if ai+1 is to the right of aibi
find bi+1 , scanning B clockwise with CheckForClosePoint from bi
if ai+1 is anywhere on aibi
bi+1 = bi
3.2 Compute di+1 = d (ai+1 , bi+1 )
3.3 h (A, B) = max { h (A, B), di+1 } |
3.4 Complexity
If polygons A and B respectively have n and m vertices, then :
- Step 1 can clearly be done in O(m) time ;
- Step 2 takes constant time O(1) ;
- Step 3 will be executed (n-1) times, that is O(n) ;
- Step 3.1 will not be executed in total more than O(2m). This is a consequence of lemma 3, which guarantees that polygon B can not be scanned more than twice ;
- Steps 3.2 and 3.3 are done in constant time O(1) ;
So the algorithm for computing h(A, B) takes :
O(m) + O(n) + O(2m) = O(n+m)
To find H(A, B), the algorithm needs to executed twice ; the total complexity for computing Hausdorff distance then stays linear to O(n+m).
3.5 Interactive applet
This applet illustrates the algorithm for computing h(A,B). You only need to draw two polygons, and then press the “step” or “run” button. Left click to define a new vertex, and close the polygon by clicking near the first vertex. Polygon A is the first one you draw, in green, while polygon B appears next, in red.
The algorithm was slightly modified to make it more appealing visually. Even if this algorithm is intended for two polygons totally separated from each other, it also works when B is inside A. However, it won’t work if A is inside of B, or when A and B are partially intersecting. You’re allowed anyway to try these cases to see what happens !
When defining your polygons, you will see a yellow area that indicates where you can add the next vertex, so the polygon keeps convex. The applet won’t let you define a non-convex polygon.
Please notice that the first time you draw the second half of a polygon, you will have to wait a few seconds until the Jama package loads.
| NB : | The first time you draw the second half of a polygon, you will have to wait a few seconds until the Jama package loads. |
One of the main application of the Hausdorff distance is image matching, used for instance in image analysis, visual navigation of robots, computer-assisted surgery, etc. Basically, the Hausdorff metric will serve to check if a template image is present in a test image ; the lower the distance value, the best the match. That method gives interesting results, even in presence of noise or occlusion (when the target is partially hidden).
Say the small image below is our template, and the large one is the test image :
We want to find if the small image is present, and where, in the large image. The first step is to extract the edges of both images, so to work with binary sets of points, lines or polygons :
Edge extraction is usually done with one of the many edge detectors known in image processing, such as Canny edge detector, Laplacian, Sobel, etc. After applying Rucklidge’s algorithm that minimizes Hausdorff distance between two images, the computer found a best match :
For this example, at least 50 % of the template points had to lie within 1 pixel of a test image point, and vice versa. Some scaling and skew were also allowed, to prevent rejection due to a different viewing angle of the template in the test image (these images and results come from Michael Leventon’s pages). Other algorithms might allow more complicated geometric transformations for registering the template on the test image.
In spite of my interest for the topic, an online demo is definitely beyond the scope of this Web project ! So here are some Web resources about image matching with Hausdorff distance :
Glossary
References
Published by
Nineye under
algorithm on
February 24, 2009
지도에 경로 선형을 매칭시켜서 경로를 복원하는 연구개발을 진행할 때, 여러 가지 알고리즘을 보고서, 가장 가능성 및 신뢰성이 높은 알고리즘으로 frechet distance 알고리즘을 선택했는다.
다음은 frechet distance 알고리즘과 관련된 여러 논문들을 읽고 이 알고리즘의 개념을 이해하는데 가장 큰 도움을 준 자료이다.
이 글을 읽고 좀 더 깊은 연구를 하려면 http://nineye.net/blog/archives/336 글에 첨부된 논문들을 참고한다.
원문 : http://cgm.cs.mcgill.ca/~athens/cs507/Projects/2002/StephanePelletier/#equ3
Computing the Frechet distance between two polygonal curves
By Stéphane Pelletier
Computational Geometry 308-507A
McGill University
Fall 2002
Note : This web page requires the plugin for Java 2 v1.4.1
Maurice Fréchet (1878-1973)
In object recognition applications, the objects of interest are often modelized in the system database by their shape, and thus, an important step of the recognition process consists generally to look for known shapes in an image. In many applications, two-dimensional shapes are given by the planar curves forming their boundaries. Consequently, a natural problem in shape comparison and recognition is to measure how much two given curves resemble
each other. Naturally, the first question to be answered is what distance measure between curves should be used to reflect the intuitive notion of resemblance.
One possibility consists in approximating each curve by a set of points and then using the Hausdorff distance, which is defined as follows:
|
|
(eq. 1)
|
where d is the underlying metric in the plane, for example the Euclidian distance, and A and B are the two sets of points describing the two curves to be compared. While the Hausdorff distance is an appropriate measure in many applications, the following figure shows an example where it is not. The two curves have a small Hausdorff distance, but do not look like each other at all.
Figure 1: Problem with Hausdorff distance
The reason of this discrepancy is that the Hausdorff distance only takes into account the sets of points on both curves and does not reflect the course of the curves. However, the course of curves is important in many applications, for example in handwriting recognition. In order to overcome this problem, one can use the Fréchet distance, that was first defined by Maurice Fréchet (1878-1973).
The Fréchet distance can be illustrated as follows : suppose a man is walking his dog and that he is constrained to walk on a curve and his dog on another curve. Both the man and the dog are allowed to control their speed independently, but are not allowed to go backwards. Then, the Fréchet distance of the curves is the minimal length of a leash that is necessary.
As performing mathematical operations on a curve of arbitrary shape turns out to be difficult in some cases, it is often useful to approximate a curve by a polygonal curve. A polygonal curve P:[0, N] is a continuous and piecewise linear curve made of N connected segments. Such a curve can be parametrized with a parameter a
such that P(a) refers to a given position on the curve, with P(0) refering to the first vertex of the polygonal curve and P(N) refering to its last vertex, as shown in the following figure.

Figure 2: Parametrization of a polygonal curve
Now, suppose that the man of the previous example is constrained to walk on a polygonal curve P of length N and that his dog has to walk on a polygonal curve Q of length M. The position of the man and the dog on their respective curve can be modelized as a function of t by using two continuous and increasing function α(t) and β(t), where α(0) = 0, α(1) = N, β(0) = 0 and β(1) = M. The position of the man as a function of time is given by P(α(t)) and the position of the dog by Q(β(t)). While many functions α(t) and β(t) can be used, some of them will cause the distance between the dog and the man to be large at some instant, whereas other functions will not. Finding the Fréchet distance actually corresponds to find two parametrization functions α(t) and β(t) that minimize the maximum distance between the man and his dog.
The following applet illustrates the effect of α(t) and β(t) on the maximum distance. Select one of the two examples, click on the run button and observe how the cursors move along the curves. You will notice that the maximum distance between the two cursors is not the same for both examples.
Mathematically, the Fréchet distance between two curves is defined as
|
|
(eq. 2)
|
where α(t) and β(t) range over continuous and increasing functions with α(0) = 0, α(1) = N, β(0) = 0 and β(1) = M only.
This equation reads like: for every possible function α(t) and β(t), find the largest distance between the man and its dog as they walk along their respective path; finally, keep the smallest distance found among these maximum distances.
For simplicity, we’ll take d(a,b) as the Euclidian distance between a and b.
In a first time, let’s consider an easier variant of the problem, namely the following decision problem:
Given: polygonal curves P and Q and some ε ≥ 0
Decide: whether δF(P,Q) ≤ ε
Throughout the rest of this tutorial, let P:[0,p] and Q:[0,q] be polygonal curves. Therefore, p and q are the number of edges of P and Q respectively. In order to solve the decision problem, let us first consider the case where p = q = 1, i.e. P and Q are just simple segments. We define
|
|
(eq. 3)
|
that describes all pairs of points, one on P, one on Q, whose distance is at most ε. The following figure shows line segments P, Q, a distance ε > 0, and Fε being the white area within the unit square; subsequently called the free space.

Figure 3: Free space of two segments
As the distance between the two green points on the curves is smaller than ε, the corresponding green point in the unit square belongs to the free space. Conversely, the blue point in the unit square does not belong to the free space, as the distance between the two corresponding blue points on the curves is larger then ε. Also, the free space corresponding to two line segments is the intersection of the unit square with an ellipse, possibly degenerated to the space between two parallel lines. The proof of this can be found in [1]. The important thing to remember here is that the free space of two line segments is convex.
We can extend equation 3 to arbitrary polygonal curves P, Q, of length p, q respectively:
|
|
(eq. 4)
|
The free space corresponding to the curves P and Q is the combination of the free spaces of all pairs containing one segment of P and one segment of Q. The following figure shows two polygonal curves and their corresponding free space for a given value of ε.
Figure 4: Free space of two polygonal curves
The algorithm is based on the following straightforward observation:
For polygonal curves P and Q we have δF(P, Q) ≤ ε, exactly if there exists a curve within the corresponding Fε from (0, 0) to (p, q) which is monotone in both coordinates.
In the man-dog example, the monotonicity condition corresponds to the fact that none of them is allowed to go backwards. Figure 3 shows such a curve proving that for that example the Fréchet distance is lower than ε.
Now, let Fε(i, j) with 1 ≤ i ≤ p, 1 ≤ j ≤ q refer to the free cell corresponding to the segment P(i – 1) P(i) and the segment Q(i – 1)Q(i). What we are interested in at this point is to find the intersection of the free space with the contour of each Fε(i, j). If the free space intersects the edge of a cell, it means that it is possible to enter in that cell through this edge. As the free space of each unit cell is convex, it is an easy matter to determine if there is a monotone curve that goes from (0,0) to (p, q) by computing all the intersections of the free space with the contour of each cell. The following figure illustrates the values that must be calculated in order to identify the passages between neighboring cells. LFi,j refers to the left line segment bounding Fε(i, j) and BFi,j refers to the bottom line segment.

Figure 5: Values that must be calculated
Furthermore, we define
(eq. 5)
The following figure shows the difference between Fε and R ε. One can see that all the zones that were not reachable by a monotone curve in Fε have been removed in Rε. Hereafter, Rε will be referred to as the monotone free space.

Figure 6: Monotone free space of two polygonal curves
Rε can easily be computed from Fε. We will use almost the same notation as in figure 5 above to designate the values that have to be calculated for the monotone free space. That is, LRi,j will refer to the left line segment bounding Rε(i, j) and BRi,j will refer to the bottom line segment.
1. for each feasible pair (i, j) do compute LFi, j and BFi, j
2. for i := 1 to p do determine BRi, 1
3. for i := 1 to q do determine LR1, j
4. for i := 1 to p do
for j := 1 to q do
construct LRi+1, j and BRi, j+1 from LRi, j, BRi, j, LFi+1, j, BRi, j+1
5. answer yes
if (p, q)
LRp+1, q no
otherwise.
The values of figure 5 can be calculated in constant time for each free space cell by using any algorithm that computes segment-circle intersections. Therefore, if polygonal curves P and Q respectively have p and q segments, then the previous algorithm decides in O(pq) time, whether δ F(P,Q) ≤ ε.
Now, let us consider the problem of really computing the Fréchet distance. Assume that we start with ε = 0 and continuously increase ε. Then the free space becomes larger and larger and we want to determine the smallest ε such that it contains a monotone curve from (0,0) to (p,q). Observe that this can occur only in one of the following cases:
a) ε is minimal with (0,0)
Fε and (p,q)
Fε
b) ε is minimal with LFi, j or BFi,j becomes nonempty for some pair (i,j).
c) ε is minimal with ai,j = bk,j or ci,j = di,k for some i,j,k
Clearly, case a) states that δF(P,Q) must be greater or egal to the distance between the starting points of P and Q and also greater or egal to the distance between their ending points. Case b) corresponds to the opening of a new passage between two neighboring cells and case c) corresponds to the opening of a new vertical or horizontal passage within the diagram. The following figure illustrates case c) for a horizontal passage.

Figure 7: Horizontal passage that opens
The following figure shows the geometric meaning of ε in cases a), b) and c).

Figure 8: Geometric meaning of critical values
There are O(p2q + pq2) such critical values of ε, namely the distance between starting points and endpoints of P and Q (case a), the distances between vertices of one curve and edges of the other (case b), and the common distance of two vertices of one curve to the intersection point of their bisector with some edge of the other (case c). Each one of these values can be computed in O(1) time. So, we obtain the following algorithm for computing δF(P,Q).
We observe that the runtime is majorized by the one of step 2, which is O((p2q + q2p)log(pq)).
The following applet illustrates the algorithm. For simplicity reasons, the implemented algorithm doesn’t calculate the critical values of ε that correspond to case c).
- H. Alt and M. Godau, Computing the Fréchet distance between two polygonal curves, Internat. J. Comput. Geom. Appl., 5:75-91, 1995
- H. Alt, C. Knauer and C. Wenk, Matching polygonal curves with respect to the Fréchet distance”, Proc. 18th Int. Symp. Theoretical Aspect of Computer Science (STACS):63-74, 2001, Dresden, Germany
- H. Alt, C. Knauer and C. Wenk, Bounding the Fréchet distance by the Hausdorff distance”, Proc. 17th European Workshop on Computational Geometry., 166-169, 2001, Berlin, Germany