Be lost in thought…

Nineye’s personal weblog!!!

Archive for the ‘algorithm’ Category

Aho-Corasick algorithm

Published by Nineye under algorithm on February 24, 2009

Aho-Corasick Algorithm

정의

Aho-Corasick algorithm은 Alfred V. Aho and Margaret J. Corasick에 의해 발견된 문자열 검색 알고리즘이다. 이것은 입력 텍스트에서 문자열의 유한 집합의 원소들을 찾는 사전 매칭 알고리즘의 한 종류이다. 또한 이것은 모든 패턴을 한번만 매칭시켜서 이 알고리즘의 복잡도는 패턴의 크기 더하기 찾을 문자열의 크기 더하기 매칭된 문자열의 개수에 대한 선형시간이다.
Informally, the algorithm constructs a trie with suffix tree-like set of links from nodes representing a string (e.g. abc to the node corresponding the nodes corresponding to the longest proper suffix nodes (e.g. bc if it exists, else c if that exists, else the root). It also contains links from nodes to the longest suffix node that corresponds to a dictionary entry; thus all of the matches may be enumerated by following the resulting linked list. It then uses the trie at runtime, moving along the input and keeping the longest match, using the suffix links to make sure that computation is linear. For every node that is in the dictionary and every link along the dictionary suffix linked list, an output is generated.
When the pattern dictionary is known in advance (e.g. a computer virus database), the construction of the automaton can be performed once off-line and the compiled automaton stored for later use. In this case, its run time is linear in the length of the input plus the number of matched entries.
The Aho-Corasick algorithm formed the basis of the original Unix command fgrep.
The following is the Aho-Corasick data structure constructed from the specified dictionary, with each row in the table representing a node in the trie, with the column path indicating the (unique) sequence of characters from the root to the node.

용어 정의

The Trie

  • A를 유한 알파벳으로 보자. A 트리는 다음과 같은 특징을 가지는 root 트리이다.
    • 트리의 간선은 라벨이라 불리는 A의 문자로 마크된다.
    • 각각의 트리의 노드 k와 각각의 A의 문자 c에 대해, 적어도 하나의 k에서 출발하여 c로 마크되는 간선이 있다.
  • 트리의 주어진 노드 k에서, path(k)는 root로부터 k까지 라벨에 의해 만들어진 문자열을 표시한다.
  • P를 문자열의 집합으로 보자. 그러면 trie(P)는 다음의 특징을 가지는 노드의 숫자상으로 가장 작은 트리이다.
    • For all pat of P exists a node k< such that pat = path(k)

Example: P = {ab, ba, babb, bb}

The Aho/Corasick String Matching Automaton

  • 주어진 패턴 유한집합 P를 위한 Aho/Corasick 오토마타 문자열 매칭은 접미사로서 P의 단어를 포함하는 모든 단어들의 집합을 받아들이는 결정적 유한 오토마타 G이다.

An Aho/Corasick String Matchin Automaton for a given finite set P of patterns is a (deterministic) finite automaton G accepting the set of all words containing a word of P as a suffix.

  • G는 다음의 성분으로 구성된다.
    1. 상태의 유한집합 Q(inite set Q of states)
    2. 유한알파벳 A(finite alphabeth A)
    3. transition function g: Q × A -> Q + {fail}
    4. failure function h: Q -> Q + {fail}
    5. initial state q0 in Q
    6. a set F of final states

Example: P = {ab, ba, babb, bb}

The transition function

  • g: Q × A -> Q + {fail}를 결정적 유한 오토마타 (A, Q, init, g, T)의 transition function으로 표시하도록 두자. 여기서 A는 입력 알파벳이고, Q는 유한상태집합이고, init은 초기상태이고, T는 종료상태이다.

Let g: Q × A -> Q + {fail} denote the transition function of an deterministic finite automaton (A, Q, init, g, T), where A is an input alphabeth, Q is a finite set of states, init is the initial state and T is the set of terminal states.

  • g(q,a)의 값은 입력심볼 a에 의해 라벨붙여진 transition에 의해 상태 q로부터 온 상태이다. 우리는 또한 문자 w에 대해서 만약 존재한다면 g(q,w)가 표시한다고 말할 수 있다.

The value g(q,a) is the state reached from state q by the transition labeled by the input symbol a. We can also say that for the word w, g(q,w) denotes, if it exists, the state reached after reading the word w in the automaton from the state q.

  • 경제적인 이유로 우리는 g가 부분함수가 되는 것을 허용한다.

For reasons of economy we allow g to be a partial function.

The failure function

  • Q를 Aho/Corasick 오토마타의 상태집합으로 보자.

Let Q be the set of states of an Aho/Corasick automaton.

  • h: Q -> Q + {fail} 는 failure function을 나타낸다.

Let h: Q -> Q + {fail} denote the failure function.

  • q를 Q의 상태들이 되는 q’로 두자.

Let q, q’ be states of Q.

  • Aho/Corasick automaton의 failure function은 다음과 같이 정의된다

The failure function of an Aho/Corasick automaton is defined as follows:

h(q) = q’ iff among the states of Q, q’ delivers the longest true suffix of path(q)
알고리즘
Algorithm: Trie Matching

/***************************/
/* computation of the trie */
/***************************/

/* m[p] = lenght of pat[p] */
/* #pat = number of patterns */
/* g(Node, Character) is the transition function */

final = empty set
FOR p = 1 TO #pat
q = root
FOR j = 1 TO m[p]
IF g(q, pat[p][j]) == null
insert(q, pat[p][j])
ENDIF
q = g(q, pat[p][j])
ENDFOR
final = union(final, q)
ENDFOR

/***************************************/
/* computation of the failure function */
/***************************************/

/* h(Node) is the failure function */

h(root) = fail
FORALL {q’ | parent(q’) == root}
h(q’) = root
ENDFOR

FORALL {q’ | depth(q’)>1} in BFS order
q = parent(q’)
c = the symbol with g(q,c) == q’
r = h(q)
WHILE r != fail AND g(r,c) == fail
r = h(r)
ENDWHILE
IF r == fail
h(q’) = g(root,c)
ELSE
h(q’) = g(r,c)
IF isElement(h(q’), final)
final = union(final, q’)
ENDIF
ENDIF
ENDFOR

/****************/
/* Aho/Corasick */
/****************/

/* n = length of text */

q = root
FOR i = 1 TO n
WHILE q != fail AND g(q, text[i]) == fail
q = h(q)
ENDWHILE
IF q == fail
q = root
ELSE
q = g(q, text[i])
ENDIF
IF isElement(q, final)
RETURN TRUE
ENDIF
ENDFOR
RETURN FALSE

complexity

Time complexity: O (m + n)
Space complexity: O (m)

참조

Leave a Reply

Hausdorff distance between convex polygons

Published by Nineye under algorithm on February 24, 2009

Hausdorff distance 알고리즘은 frechet distance 알고리즘의 이전에 pattern matching 알고리즘에 많이 이용된 알고리즘이다. 하지만 matching의 정확성에 대한 문제로 여러 가지 논란이 있었고, 현재는 일반적으로 Frechet distance 알고리즘이 Hausdorff distance 알고리즘의 대안이라고 많이 소개가 되고 있다.

Hausdorff distance 알고리즘은 기본적으로 edge가 아닌 vertex에 중점을 두고 있기 때문에 geometric pattern matching에서는 고려가 되고 있지 않은 알고리즘이지만, 성능이 빠르다는 큰 장점을 가지고 있기 때문에 아직도 많이 쓰이는 분야가 있다.

결국 수많은 Pattern matching 알고리즘의 특성들을 파악하여 두 알고리즘 중, 어떤 것을 선택하는지 결정해야 하고, 필자는 지도와 경로와의 관계에 대한 연구 개발을 했기 때문에 Frechet Distance 알고리즘을 사용하였다.

원문 : http://www-cgrl.cs.mcgill.ca/%7Egodfried/teaching/cg-projects/98/normand/main.html

16
Normand Grégoire and Mikael Bouillot
1. Introduction
2. What is Hausdorff distance ?
3. Computing Hausdorff distance
3.1 Assumptions
3.2 Lemmas
3.3 Algorithm
3.4 Complexity
3.5 Interactive applet
4. Application examplesGlossary
References

22 Web project presented to Mr. Godfried Toussaint
for the course CS 507 Computational Geometry
McGill University, fall 1998

1. Introduction

When talking about distances, we usually mean the shortest : for instance, if a point X is said to be at distance D of a polygon P, we generally assume that D is the distance from X to the nearest point of P. The same logic applies for polygons : if two polygons A and B are at some distance from each other, we commonly understand that distance as the shortest one between any point of A and any point of B. Formally, this is called a minimin function, because the distance D between A and B is given by :

32 (eq. 1)

This equation reads like a computer program : « for every point a of A, find its smallest distance to any point b of B ; finally, keep the smallest distance found among all points a ».

That definition of distance between polygons can become quite unsatisfactory for some applications ; let’s see for example fig. 1. We could say the triangles are close to each other considering their shortest distance, shown by their red vertices. However, we would naturally expect that a small distance between these polygons means that no point of one polygon is far from the other polygon. In this sense, the two polygons shown in fig. 1 are not so close, as their furthest points, shown in blue, could actually be very far away from the other polygon. Clearly, the shortest distance is totally independent of each polygonal shape.

42

Figure 1 : The shortest distance doesn’t consider the whole shape.

Another example is given by fig. 2, where we have the same two triangles at the same shortest distance than in fig. 1, but in different position. It’s quite obvious that the shortest distance concept carries very low informative content, as the distance value did not change from the previous case, while something did change with the objects.

52

Figure 2 : The shortest distance doesn’t account for the position of the objects.

As we’ll see in the next section, in spite of its apparent complexity, the Hausdorff distance does capture these subtleties, ignored by the shortest distance.

2. What is Hausdorff distance ?

Named after Felix Hausdorff (1868-1942), Hausdorff distance is the « maximum distance of a set to the nearest point in the other set » [Rote91]. More formally, Hausdorff distance from set A to set B is a maximin function, defined as

62 (eq. 2)

where a and b are points of sets A and B respectively, and d(a, b) is any metric between these points ; for simplicity, we’ll take d(a, b) as the Euclidian distance between a and b. If for instance A and B are two sets of points, a brute force algorithm would be :

Brute force algorithm : 1. h = 0
2. for every point ai of A,
2.1 shortest = Inf ;
2.2 for every point bj of B
dij = d (ai , bj )
if dij < shortest then
shortest = dij
2.3 if shortest > h then
h = shortest
72

Figure 3 : Hausdorff distance on point sets.

 

This is illustrated in fig. 3 : just click on the arrow to see the basic steps of this computation. This algorithm obviously runs in O(n m) time, with n and m the number of points in each set.

It should be noted that Hausdorff distance is oriented (we could say asymmetric as well), which means that most of times h(A, B) is not equal to h(B, A). This general condition also holds for the example of fig. 3, as h(A, B) = d(a1, b1), while h(B, A) = d(b2, a1). This asymmetry is a property of maximin functions, while minimin functions are symmetric.

A more general definition of Hausdorff distance would be :

H (A, B) = max { h (A, B), h (B, A) } (eq.3)

which defines the Hausdorff distance between A and B, while eq. 2 applied to Hausdorff distance from A to B (also called directed Hausdorff distance). The two distances h(A, B) and h(B, A) are sometimes termed as forward and backward Hausdorff distances of A to B. Although the terminology is not stable yet among authors, eq. 3 is usually meant when talking about Hausdorff distance. Unless otherwise mentionned, from now on we will also refer to eq. 3 when saying “Hausdorff distance”.

If sets A and B are made of lines or polygons instead of single points, then H(A, B) applies to all defining points of these lines or polygons, and not only to their vertices. The brute force algorithm could no longer be used for computing Hausdorff distance between such sets, as they involve an infinite number of points.

82

So, what about the polygons of fig. 1 ? Remember, some of their points were close, but not all of them. Hausdorff distance gives an interesting measure of their mutual proximity, by indicating the maximal distance between any point of one polygon to the other polygon. Better than the shortest distance, which applied only to one point of each polygon, irrespective of all other points of the polygons.

92

Figure 4 : Hausdorff distance shown around extremum of each triangles of fig. 1. Each circle has a radius of H( P1 , P2).

The other concern was the insensitivity of the shortest distance to the position of the polygons. We saw that this distance doesn’t consider at all the disposition of the polygons. Here again, Hausdorff distance has the advantage of being sensitive to position, as shown in fig.5.

102

Figure 5 : Hausdorff distance for the triangles of fig. 4 at the same shortest distance, but in different position.

3. Computing Hausdorff distance between convex polygons

3.1 Assumptions

Throughout the rest of our discussion, we assume the following facts about polygons A and B :

  • Polygons A and B are simple convex polygons ;
  • Polygons A and B are disjoint from each other, that is :- they don’t intersect together ;
    - no polygon contains the other.

3.2 Lemmas

The algorithm explained in the next section is based on three geometric observations, presented here. In order to simplify the text, we assume two points a and b that belong respectively to polygons A and B, such that :

d (a, b) = h (A, B)

In simple words, a is the furthest point of polygon A relative to polygon B, while b is the closest point of polygon B relative to polygon A.

Lemma 1a :

The perpendicular to ab at a is a supporting line of A, and A is on the same side as B relative to that line.

Proof of lemma 1a

Lemma 1b :

The perpendicular to ab at b is a supporting line of B, and a and B are on different sides relative to that line.

Proof of lemma 1b

Lemma 2 :

There is a vertex x of A such that the distance from x to B is equal to h (A, B).

Proof of lemma 2

Lemma 3 :

Let bi be the closest point of B from a vertex a i of A. If µ is the moving direction (clockwise or counterclockwise) from bi to bi+1 then, for a complete cycle through all vertices of A, µ changes no more than twice.

Proof of lemma 3

3.3 Algorithm

The algorithm presented here was proposed by [Atallah83]. Its basic strategy is to compute successively h(A,B) and h(B, A) ; because of lemma 2, there is no need to query every point of the starting polygon, but only its vertices.

111

An important fact used by this algorithm is that a closest point can only be a vertex of the target polygon, or the foot z of a line perpendicular to one of its edges.

This fact suggests a function to check for the existence of a possible closest point. Given a source point a and a target edge defined by a point b1 and a vertex b2 :


Function z = CheckForClosePoint (a, b1 , b2 ) :

Compute the position z where the line that passes through b1 and b2 crosses its perpendicular through a ;
if z is between b1 b2 then return z ;
else compute at b2 a line P perpendicular to the line ab2 ;
if P is a supporting line of B then return b2 ;
else return NULL.

That function obviously uses lemma 1b to decide whether or not the closest point of B might be located on the target edge, that should be close to a. It also supposes that the source point a and b2 are not located on different sides of the perpendicular to [b1b2 ] at b1, accordingly to lemma 3.

Now we are ready for the main algorithm ; the vertices of both polygons are presumed to be enumerated counterclockwise :

Algorithm for computing h(A, B) : 1. From a1, find the closest point b1 and compute d1 = d ( a1, b1 )
2. h(A, B) = d1
3. for each vertex ai of A,
3.1 if ai+1 is to the left of aibi
find bi+1 , scanning B counterclockwise with CheckForClosePoint from bi
if ai+1 is to the right of aibi
find bi+1 , scanning B clockwise with CheckForClosePoint from bi
if ai+1 is anywhere on aibi
bi+1 = bi
3.2 Compute di+1 = d (ai+1 , bi+1 )
3.3 h (A, B) = max { h (A, B), di+1 }

3.4 Complexity

If polygons A and B respectively have n and m vertices, then :

  • Step 1 can clearly be done in O(m) time ;
  • Step 2 takes constant time O(1) ;
  • Step 3 will be executed (n-1) times, that is O(n) ;
  • Step 3.1 will not be executed in total more than O(2m). This is a consequence of lemma 3, which guarantees that polygon B can not be scanned more than twice ;
  • Steps 3.2 and 3.3 are done in constant time O(1) ;

So the algorithm for computing h(A, B) takes :

O(m) + O(n) + O(2m) = O(n+m)

To find H(A, B), the algorithm needs to executed twice ; the total complexity for computing Hausdorff distance then stays linear to O(n+m).

3.5 Interactive applet

This applet illustrates the algorithm for computing h(A,B). You only need to draw two polygons, and then press the “step” or “run” button. Left click to define a new vertex, and close the polygon by clicking near the first vertex. Polygon A is the first one you draw, in green, while polygon B appears next, in red.

The algorithm was slightly modified to make it more appealing visually. Even if this algorithm is intended for two polygons totally separated from each other, it also works when B is inside A. However, it won’t work if A is inside of B, or when A and B are partially intersecting. You’re allowed anyway to try these cases to see what happens !

When defining your polygons, you will see a yellow area that indicates where you can add the next vertex, so the polygon keeps convex. The applet won’t let you define a non-convex polygon.

Please notice that the first time you draw the second half of a polygon, you will have to wait a few seconds until the Jama package loads.
 

NB : The first time you draw the second half of a polygon, you will have to wait a few seconds until the Jama package loads.

4. Application examples

One of the main application of the Hausdorff distance is image matching, used for instance in image analysis, visual navigation of robots, computer-assisted surgery, etc. Basically, the Hausdorff metric will serve to check if a template image is present in a test image ; the lower the distance value, the best the match. That method gives interesting results, even in presence of noise or occlusion (when the target is partially hidden).

Say the small image below is our template, and the large one is the test image :

153 123

We want to find if the small image is present, and where, in the large image. The first step is to extract the edges of both images, so to work with binary sets of points, lines or polygons :

161 132

Edge extraction is usually done with one of the many edge detectors known in image processing, such as Canny edge detector, Laplacian, Sobel, etc. After applying Rucklidge’s algorithm that minimizes Hausdorff distance between two images, the computer found a best match :

142

For this example, at least 50 % of the template points had to lie within 1 pixel of a test image point, and vice versa. Some scaling and skew were also allowed, to prevent rejection due to a different viewing angle of the template in the test image (these images and results come from Michael Leventon’s pages). Other algorithms might allow more complicated geometric transformations for registering the template on the test image.

In spite of my interest for the topic, an online demo is definitely beyond the scope of this Web project ! So here are some Web resources about image matching with Hausdorff distance :


Glossary

 

References

 

Leave a Reply

Frechet Distance algorithm

Published by Nineye under algorithm on February 24, 2009

지도에 경로 선형을 매칭시켜서 경로를 복원하는 연구개발을 진행할 때, 여러 가지 알고리즘을 보고서, 가장 가능성 및 신뢰성이 높은 알고리즘으로 frechet distance 알고리즘을 선택했는다.

다음은 frechet distance 알고리즘과 관련된 여러 논문들을 읽고 이 알고리즘의 개념을 이해하는데 가장 큰 도움을 준 자료이다.

이 글을 읽고 좀 더 깊은 연구를 하려면 http://nineye.net/blog/archives/336 글에 첨부된 논문들을 참고한다.

원문 : http://cgm.cs.mcgill.ca/~athens/cs507/Projects/2002/StephanePelletier/#equ3

Computing the Frechet distance between two polygonal curves

By Stéphane Pelletier

Computational Geometry 308-507A
McGill University
Fall 2002

Note : This web page requires the plugin for Java 2 v1.4.1


113
Maurice Fréchet (1878-1973)


1. Introduction

In object recognition applications, the objects of interest are often modelized in the system database by their shape, and thus, an important step of the recognition process consists generally to look for known shapes in an image. In many applications, two-dimensional shapes are given by the planar curves forming their boundaries. Consequently, a natural problem in shape comparison and recognition is to measure how much two given curves resemble each other. Naturally, the first question to be answered is what distance measure between curves should be used to reflect the intuitive notion of resemblance.

One possibility consists in approximating each curve by a set of points and then using the Hausdorff distance, which is defined as follows:

21

(eq. 1)

where d is the underlying metric in the plane, for example the Euclidian distance, and A and B are the two sets of points describing the two curves to be compared. While the Hausdorff distance is an appropriate measure in many applications, the following figure shows an example where it is not. The two curves have a small Hausdorff distance, but do not look like each other at all.

31

Figure 1: Problem with Hausdorff distance

The reason of this discrepancy is that the Hausdorff distance only takes into account the sets of points on both curves and does not reflect the course of the curves. However, the course of curves is important in many applications, for example in handwriting recognition. In order to overcome this problem, one can use the Fréchet distance, that was first defined by Maurice Fréchet (1878-1973).


2. What is Fréchet distance ?

The Fréchet distance can be illustrated as follows : suppose a man is walking his dog and that he is constrained to walk on a curve and his dog on another curve. Both the man and the dog are allowed to control their speed independently, but are not allowed to go backwards. Then, the Fréchet distance of the curves is the minimal length of a leash that is necessary.

As performing mathematical operations on a curve of arbitrary shape turns out to be difficult in some cases, it is often useful to approximate a curve by a polygonal curve. A polygonal curve P:[0, N] is a continuous and piecewise linear curve made of N connected segments. Such a curve can be parametrized with a parameter a such that P(a) refers to a given position on the curve, with P(0) refering to the first vertex of the polygonal curve and P(N) refering to its last vertex, as shown in the following figure.

41

Figure 2: Parametrization of a polygonal curve

Now, suppose that the man of the previous example is constrained to walk on a polygonal curve P of length N and that his dog has to walk on a polygonal curve Q of length M. The position of the man and the dog on their respective curve can be modelized as a function of t by using two continuous and increasing function α(t) and β(t), where α(0) = 0, α(1) = N, β(0) = 0 and β(1) = M. The position of the man as a function of time is given by P(α(t)) and the position of the dog by Q(β(t)). While many functions α(t) and β(t) can be used, some of them will cause the distance between the dog and the man to be large at some instant, whereas other functions will not. Finding the Fréchet distance actually corresponds to find two parametrization functions α(t) and β(t) that minimize the maximum distance between the man and his dog.

The following applet illustrates the effect of α(t) and β(t) on the maximum distance. Select one of the two examples, click on the run button and observe how the cursors move along the curves. You will notice that the maximum distance between the two cursors is not the same for both examples.

Mathematically, the Fréchet distance between two curves is defined as

51

(eq. 2)

where α(t) and β(t) range over continuous and increasing functions with α(0) = 0, α(1) = N, β(0) = 0 and β(1) = M only.

This equation reads like: for every possible function α(t) and β(t), find the largest distance between the man and its dog as they walk along their respective path; finally, keep the smallest distance found among these maximum distances.

For simplicity, we’ll take d(a,b) as the Euclidian distance between a and b.


3. Computing the Fréchet distance between two polygonal curves

3.1 Decision problem

In a first time, let’s consider an easier variant of the problem, namely the following decision problem:

Given: polygonal curves P and Q and some ε ≥ 0

Decide: whether δF(P,Q) ≤ ε

Throughout the rest of this tutorial, let P:[0,p] and Q:[0,q] be polygonal curves. Therefore, p and q are the number of edges of P and Q respectively. In order to solve the decision problem, let us first consider the case where p = q = 1, i.e. P and Q are just simple segments. We define

61

(eq. 3)

that describes all pairs of points, one on P, one on Q, whose distance is at most ε. The following figure shows line segments P, Q, a distance ε > 0, and Fε being the white area within the unit square; subsequently called the free space.

71

Figure 3: Free space of two segments

As the distance between the two green points on the curves is smaller than ε, the corresponding green point in the unit square belongs to the free space. Conversely, the blue point in the unit square does not belong to the free space, as the distance between the two corresponding blue points on the curves is larger then ε. Also, the free space corresponding to two line segments is the intersection of the unit square with an ellipse, possibly degenerated to the space between two parallel lines. The proof of this can be found in [1]. The important thing to remember here is that the free space of two line segments is convex.

We can extend equation 3 to arbitrary polygonal curves P, Q, of length p, q respectively:

81

(eq. 4)

The free space corresponding to the curves P and Q is the combination of the free spaces of all pairs containing one segment of P and one segment of Q. The following figure shows two polygonal curves and their corresponding free space for a given value of ε.

91

Figure 4: Free space of two polygonal curves

The algorithm is based on the following straightforward observation:

For polygonal curves P and Q we have δF(P, Q) ≤ ε, exactly if there exists a curve within the corresponding Fε from (0, 0) to (p, q) which is monotone in both coordinates.

In the man-dog example, the monotonicity condition corresponds to the fact that none of them is allowed to go backwards. Figure 3 shows such a curve proving that for that example the Fréchet distance is lower than ε.

Now, let Fε(i, j) with 1 ≤ ip, 1 ≤ jq refer to the free cell corresponding to the segment P(i – 1) P(i) and the segment Q(i – 1)Q(i). What we are interested in at this point is to find the intersection of the free space with the contour of each Fε(i, j). If the free space intersects the edge of a cell, it means that it is possible to enter in that cell through this edge. As the free space of each unit cell is convex, it is an easy matter to determine if there is a monotone curve that goes from (0,0) to (p, q) by computing all the intersections of the free space with the contour of each cell. The following figure illustrates the values that must be calculated in order to identify the passages between neighboring cells. LFi,j refers to the left line segment bounding Fε(i, j) and BFi,j refers to the bottom line segment.

101

Figure 5: Values that must be calculated

Furthermore, we define

122

(eq. 5)

The following figure shows the difference between Fε and R ε. One can see that all the zones that were not reachable by a monotone curve in Fε have been removed in Rε. Hereafter, Rε will be referred to as the monotone free space.

131

Figure 6: Monotone free space of two polygonal curves

Rε can easily be computed from Fε. We will use almost the same notation as in figure 5 above to designate the values that have to be calculated for the monotone free space. That is, LRi,j will refer to the left line segment bounding Rε(i, j) and BRi,j will refer to the bottom line segment.

3.1.1 Algorithm for solving the decision problem

3.1.2 Complexity

The values of figure 5 can be calculated in constant time for each free space cell by using any algorithm that computes segment-circle intersections. Therefore, if polygonal curves P and Q respectively have p and q segments, then the previous algorithm decides in O(pq) time, whether δ F(P,Q) ≤ ε.

3.2 Finding the Fréchet distance

Now, let us consider the problem of really computing the Fréchet distance. Assume that we start with ε = 0 and continuously increase ε. Then the free space becomes larger and larger and we want to determine the smallest ε such that it contains a monotone curve from (0,0) to (p,q). Observe that this can occur only in one of the following cases:

Clearly, case a) states that δF(P,Q) must be greater or egal to the distance between the starting points of P and Q and also greater or egal to the distance between their ending points. Case b) corresponds to the opening of a new passage between two neighboring cells and case c) corresponds to the opening of a new vertical or horizontal passage within the diagram. The following figure illustrates case c) for a horizontal passage.

141

Figure 7: Horizontal passage that opens

The following figure shows the geometric meaning of ε in cases a), b) and c).

152

Figure 8: Geometric meaning of critical values

There are O(p2q + pq2) such critical values of ε, namely the distance between starting points and endpoints of P and Q (case a), the distances between vertices of one curve and edges of the other (case b), and the common distance of two vertices of one curve to the intersection point of their bisector with some edge of the other (case c). Each one of these values can be computed in O(1) time. So, we obtain the following algorithm for computing δF(P,Q).

3.2.1 Algorithm for finding the Fréchet distance

3.2.2 Complexity

We observe that the runtime is majorized by the one of step 2, which is O((p2q + q2p)log(pq)).


4. Applet

The following applet illustrates the algorithm. For simplicity reasons, the implemented algorithm doesn’t calculate the critical values of ε that correspond to case c).

Instructions for using the applet


5. References

  1. H. Alt and M. Godau, Computing the Fréchet distance between two polygonal curves, Internat. J. Comput. Geom. Appl., 5:75-91, 1995
  2. H. Alt, C. Knauer and C. Wenk, Matching polygonal curves with respect to the Fréchet distance”, Proc. 18th Int. Symp. Theoretical Aspect of Computer Science (STACS):63-74, 2001, Dresden, Germany
  3. H. Alt, C. Knauer and C. Wenk, Bounding the Fréchet distance by the Hausdorff distance”, Proc. 17th European Workshop on Computational Geometry., 166-169, 2001, Berlin, Germany

Leave a Reply

Be lost in thought… is powered by Nineye in WordPress | Entries RSS and Comments RSS

Copyright © 2009. All right reserved by Nineye.