October 1992 -- README for Filtering, Segmentation, and Depth code package
Copyright (c) 1990,1,2 Mark Nitzberg
and President and Fellows of Harvard College
All rights reserved.
The code in this directory is provided with all the usual disclaimers.
It depends on a vision utilities package distributed by
the Harvard Robotics Laboratory called HVision.
For a copy, contact George Thomas, thomas@metatron.harvard.edu.
Like all vision utilities built from scratch, it has its own
file format and coding idioms that are hardly self-evident.
The header file "hvision.h" should explain some.
FILTERING
The pseudo-C code for the nonlinear filter is in filter.pseudo-c.
You will have to make serious additions (#define Width and Height
and typedef double IMAGE[Width][Height], input the image, deal
with border issues) before it will run.
The C code for the actual experimental filter, which will run
about 5 times slower than the pseudo-code can be made to run,
is in filter.c.
SEGMENTATION
The programs are
curves -- find contours, jumping small gaps, produce an edge list
smooth -- smooth an edge list
contin -- find an optimal set of continuations for disrupted
contours from an edge list and image, producing a smoothed
edge list and region map
pscurves -- produce a postscript file from an edge list
DEPTH RECOVERY
sp -- produce a segmentation with overlaps given a smoothed edge
list and region map and input image. Produces postscript.
The programs explain themselves when invoked with no arguments,
or with the -help argument.
Test images from the book are in the "im" directory.
Questions and comments to Mark Nitzberg, nitzberg@math.harvard.edu.
Good luck. A "Demo" file that gives the invocation parameters
for the book figures will be coming soon.
Early notes:
-----------------------------------------------------------------------
January 1990 -- intermediate file formats for 2.1-d system.
curves foo.im ( gives foo.e ) finds high-gradient edges,
jumps gaps, finds T's & corners
smooth foo.e [-c foo.c] smooths (mindful of T's) & rejoins junctions.
output ( foo.es ) es for edgemap smoothed.
or ( foo.ec ) If -c option given, uses the contin.
( foo.r ) pairing in foo.c (see contin)
for interpreting the T's, then adds
the continuations and puts out a new
region map as well.
contin foo.es ( foo.c ) finds consistent continuation pairings
-- consults foo.im for region intensities
c for continuation list
sp foo.r ( foo.sp ) Final segmentation.
sp is for stacked pictures.
Result is postscript.
Print utility:
pscurves foo.e* ( foo.e*.ps ) converts '.e' or '.ec' files to postscript.
---------------------------------------------------------------------------
F I L E F O R M A T S
1. Input image
foo.im foo.bw foo.i
Image files. Any format that hvReadImage() can read.
To obtain the HVision image utilities, contact
George Thomas, thomas@gramian.harvard.edu.
2. Edge map
foo.e, foo.es, foo.ec
Lines beginning with # are comments, except "# CONTOUR" lines.
First data line in the file tells the size of the pixel space
width height
Then point lists follow; information for 1 point per line,
with blank lines between point lists.
Each line gives four floating point (C scanf %g) numbers
x y theta kappa
(x,y) are the pixel coordinates (non-optional).
theta is an optional edge tangent angle in degrees (sig. only mod 180),
and kappa is an optional curvature estimate.
Each point list must be preceeded by a special comment line,
which begins with "# CORNER ".
# CONTOUR points [ words ... ]
Here, is the number of points in the contour.
The optional words are:
CLOSED indicates a closed contour
WITHCORNER first point of the contour is a corner point
(applies only to closed contours)
CONTINUATION this is a continuation contour
BOUNDARY this contour is actually part of the
image boundary
3. Continuation list
foo.c ( an optional number)
Lines beginning with # are comments.
Each line gives one continuation pair as four numbers,
until end of file. This gives a "complete" set of continuations.
c1 end1 c2 end2
c1 and c2 are contour numbers, (0 ... NContours-1)
which correspond to the order in which contours appear
in the corresponding .e or .es file.
end1 and end2 can be 0 or 1 as the endpoint
is the first or last point in the contour.
Again, the first point of a contour is the one
listed first in the corresponding .e or .es file.
4. Region graph
foo.r
Gives info on edges and regions.
`nodes' are edge endpoints, implicitly numbered 1 .. N.
E D G E S :
i j a b length curvature o1 o2
...
one line per edge; implicitly numbered 1 ... E.
This edge goes from node i to j, region a on the left and b on
the right (1 ... R), with given arc length and total square of
curvature. o1 and o2 are angles, in degrees, of the tangents
of the contour at nodes i and j. These tangents must point
outward at the contour endpoints.
*** See NOTES below.
A blank line separates edges from regions.
R E G I O N S :
area mean
...
the area of the region, and its mean intensity.
Regions are implicitly numbered 1 ... R.
NOTES: * Region number 0 means `outside the image'.
Use region 0 to identify edges that comprise the image border.
* If region 0 never occurs, then region 1 is taken to be the surround.
That is, region 1 is the one that touches the picture border everywhere.
* Loops (edges with the same node at both ends) are forbidden.
If your graph has a loop, add a node.
* Regions _may_ have holes.
* Several edges _may_ share the same pair of nodes.
* Nodes _may_ have as few as two edges (corners).
* An edge shared by two regions which are the same region
will be deleted during pre-processing.
5. Final output--stacked pictures
foo.sp
Lines beginning with '#' are comment lines.
Each data line is a list of blank-separated region numbers,
denoting the regions in one layer.
The first line gives the bottom layer, which is usually
the whole domain, and each successive line gives a layer
`closer' to the viewer.
Notes on the "curves" program.
Input: a monochrome image I with SNR >~ 10, pixel values 0-255 for black-white.
------------------------------------------
curve detection, "curves" program
Compute Ix^2, IxIy, and Iy^2 using nearest neighbors, into 3 images.
Convolve each of these images with a circular gaussian of
effective diameter 8 Sigma.
/ Ix^2 IxIy \
Now at each point Xo, the matrix A(Xo) = ( )
\ IxIy Iy^2 /
gives edge direction, strength and coherence as follows.
For edge strength, we can use sqrt(Ix^2+Iy^2).
Let e1,e2 (e1 < e2) be the eigenvalues of A, and v1 and v2
the corresponding eigenvectors.
sqrt(e1/e2) measures edge coherence (eccentricity),
as the short axis of the ellipse ~ 1/sqrt(e2),
and the long axis ~ 1/sqrt(e1).
v2 gives edge direction, and e1 is a measure of cornerness
within a diameter d neighborhood.
These values are stored in four new images,
Strength, Edgex and Edgey, Cornerness.
Edgex and Edgey are the components of v2 normalized to unit length.
Using the hysteresis principle of Canny's edge-finding algorithm,
set a high threshold T such that E(T) = { Xo | Strength(Xo) > T }
has no more than, say, 40% of all pixels,
and a low threshold t that E(t) captures, say, 70% of all pixels.
This corresponds to low & hi of 30 and 60 percentile.
The low and high thresholds are parameters given in percentiles.
Now for each Xo in E(T) such that Strength(Xo) is a local maximum
in the +/-v2(Xo) perpendicular directions,
start a new contour structure.
Build the contour in two passes, one for each direction
v2(Xo) and -v2(Xo). Move by one pixel unit distance at a time,
ie., to Xo+v2(Xo), and add this point to the contour
structure, repeating until the point is not in E(t).
Then pursue the other direction, -v2(Xo), similarly.
When finished with this contour, go on to the next Xo in E(T)
that is a local maximum perpendicular to the edge direction.
The contour structure has beginning and end pointers and is
a doubly-linked list.
Output:
`.e'
A list of curves. Each curve is a list
of points, each with (x,y)=location, (xt,yt)=tangent
and k=curvature, all in floating point. The point list
for one curve is separated from the next by a blank line.
The location (x,y) is in pixel units, but may not fall on a pixel exactly,
with (0,0) for the upper-left of the image, x increasing right
and y increasing downward.
The tangent is a vector of unit length giving the local estimate
of slope of the edge at (x,y). Its direction is (arbitrarily)
towards the next point on the curve, and away from the previous.
The curvature k is a local estimate of 1-dim curvature of
the edge at (x,y) in inverse units of distance, (1/pixelwidth)s.
It represents 1/r where r is the radius of an osculating
circle that best fits the edge at (x,y). The value can be "verified"
by computing the change in tangent angle over the distance between points.
Thus k=0 means the edge is straight at (x,y), |k|=1 describes
a hairpin turn around a neighboring pixel. Larger values
for k are unreasonably high and really indicate "corners".
Positive values of k indicate a right turn as one travels
to the next point (ie. clockwise); negative values indicate
a left turn.
The "pscurves" program produces a postscript version of an edge map.
The program "pscurves" puts out postscript that
draws the curves in postscript.
This can be printed atop the original image with
laserhs -p image; (cat image.ps; curves2ps image.curves) | lpr