Constructing concepts
An example from a research paper
There is a paper
concerning the vectorization of raster images of (black and white)
line drawings by Dov Dori and Liu
Wenyin. Before they get into the pseudo-code of their algorithm they
present a sort of step by step construction of concepts that are used
to explain the algorithm.
Here
is an excerpt (via copy and paste)
(1)
Black pixel—a
foreground pixel of graphic objects in the image.
(2)
Black pixel area—the
foreground area of graphic objects in the image.
(3)
Bar—a
straight line segment with non-zero width, which occupies a (possibly
slanted) black pixel area in the drawing image. The ideal area is an
oval consisting of a rectangle and two semi-circles, as shown in
Figure 111(a).
(4)
Arc—a
circular line segment with non-zero width, which occupies a (possibly
closed) black pixel area in the drawing. The ideal end points are
also round capped, as shown in Figure 111(b).
(5)
Polybar—a
chain of two or more equal width bars linked end to end. Bar widths
are considered equal if they are within a prescribed tolerance.
(6)
Polyline—a
polybar resulting from approximating an arc or a free curve by a
crude vectorization procedure.
(7)
Wire—a
generalization of bar, circular arc and polybar, which is used to
denote any line-like graphic object in the drawing image.
(8)
Scan Line—a
one-pixel-wide horizontal line that crosses the drawing image from
left to right, used to find the first black pixel of a black pixel
area. Scan lines are spaced vertically every n
pixels, where n
is smaller than the
minimal anticipated line width.
(9)
Current Pixel—the
pixel at which the tracking has arrived at a given point in time.
(10)
Pixel Visiting
Direction—one of
the four directions left, right, up or down, to which the pixel
following the current pixel is visited.
(11)
Directed Run—a
sequence of consecutive horizontal or vertical black pixels in the
pixel visiting direction from the first encountered black pixel to a
stop pixel (defined below).
(12)
Directed Run Sign—a
sign which is negative when the pixel visiting direction of the
directed run is left or up, and positive when it is right or down.
(13)
Stop Pixel—the
first pixel along a directed run which either encounters a white
pixel or exceeds a predefined maximal directed run length.
(14)
Run—the
undirected, unsigned sequence of consecutive black pixels formed by
linking two directed runs with opposite directions that start from a
common point.
(15)
Middle Run Point—the
middle point of a run. If the run length is even it is taken as the
left or top of the two middle points.
(16)
Length Direction—a
pixel visiting direction whose associated run is longer than the
corresponding run in the orthogonal direction. The length direction
is horizontal if the slant of the line at the current point is up to
45 degrees
and vertical otherwise.
(17)
Width Direction—the
pixel visiting direction orthogonal to the length direction of the
associated run.
(18)
Tracking Step—a
directed run of a specified length in the length direction. A
tracking step is taken to search for the next medial axis point from
the current medial axis point.
(19)
Width Run—a
run in the width direction.
(20)
Length Run—a
run in the length direction.
(21)
Tracking Cycle—a
cycle of finding the next middle run point, which consists of taking
a tracking step, making
the width run, and then obtaining the middle point of the width run.
Remarks
Let’s
make several remarks
-
It’s rather rare to have such a presentation of concepts before an algorithm is detailed!
-
The presentation is more or less step by step and assertions are based on the concepts described in previous points.
-
However the statements refer to background assumptions that the reader is supposed to be able to fill in, given his/her general understanding.
-
The presentation is nevertheless informal, and an improvement might well be possible.
-
I can imagine that there is a formal construction that could be fed to a computerized “disciple” or “assistant” that could be inspired from the above natural language construction. However, an important point I don’t have time to go into here is that simple predicate calculus is not going to be sufficient.
-
A formal construction could be checked. I am not very familiar with systems that do so, but maybe something like the calculus of constructions by Thierry Coquand is worth looking at here. However I think that there is an aspect of disambiguation to investigate.
-
The above construction is one sided: there is no hint of a disciple asking the teacher questions.
This
is one of the sources for the research by Dori and Wenyin
http://Research.Microsoft.com/Pubs/68811/Sparse_Pixel_vectorization.doc
No comments:
Post a Comment