Scanbox Approach to Shape Reconstruction for Automated Reticle Inspection


The paper describes a generalization of the scanline approach [1] to reconstruction of the shape of planar object represented by a discrete point set with a given distance threshold d. This problem arises in applications of VLSI layout image processing, e.g., during automated reticle inspection.

Two algorithms are presented, with time complexities O(min {n, (n + t)log n}) and O(n log n), respectively, where n is the number of input points, t is the number of fixed-radius neighbor pairs. The first one is applicable for sparse point sets, e.g., for almost-grid point sets. In addition, ideas from the first algorithm, combined with the scanbox approach lead to the second algorithm efficient in the general case.



Determining the shape of a planar object represented by a finite set of its points is a well-known problem of image processing. Among the major reasons for its complexity is lack of formalization of the notion of "shape". A natural approach to computationally hard problems is to design efficient algorithms treating special cases suitable to specific application areas. Various applications pose different requirements to "shape". Examples of simplest, or coarsest definitions of shape are the bounding box of an object and the convex hull. These definitions of "shape" deliver a reasonable trade-off for goal vs. efficiency in packaging, VLSI floorplanning, etc. More sophisticated "shape" definitions may be found in [2] and [3], Sect. 43.2 "Extracting Shape from Dot Patterns". A number of known algorithms in fact construct a whole family of shapes depending of a parameter of the algorithm, usually related with mutual proximity of points.

In image processing applications a common special case is shape reconstruction for a set of points approximately placed on a regular grid with step d. Examples of such grids are pixels of the scanned image or probe locations under regular probing. In these cases the following definition is often suitable.

The shape of a finite point set is a maximal-area polygonal region with side lengths not exceeding d that contains the given point set. Variations of this basic definition allow for "rectification" of shape boundary according to some angle or deflection thresholds for data compression. This definition is stated to grasp the notion of the external shape, however it may be generalized to allow for non-simply-connected regions, i.e., regions with holes.

Two algorithms are presented for shape reconstruction under the above definition of shape for a planar point set. The time complexity of the first algorithm, unlike the second one, depends on the sparseness of the input point set.

Both algorithms have the following I/O specifications:

INPUT: A set of points {a_1, a_2, ..., a_n} in the plane and proximity threshold d.

OUTPUT: Polygonal regions defining the shape of the given point set according to the above definition.


Sparseness-Sensitive Algorithm


The limited-distance neighbor graph, or d-near neighbor graph for a point set {a_1, a_2, ..., a_n} is the graph with vertex set {a_1, a_2, ..., a_n} in which two vertices are connected by the edge if and only if the distance between them is at most d.


Figure 1. Conditions for Algorithm 1.



Sketch of Algorithm 1

Step 1. Construct the d-near neighbor graph for the input point set. The graph is represented by its adjacency structure.

Step 2. For each point a_i do:

Step 2a. Sort the neighbors of a_i counterclockwise.

Step 2b. If for a pair (a_j, a_k) of consecutive neighbors in the sorted list from 2a one of the following conditions hold:

Condition A: Vectors (a_i a_j) and (a_i a_k) form an angle greater than Pi;

Condition B: The distance between a_j and a_k exceeds d;
then the point a_i and the edges (a_i, a_j) and (a_i, a_k) are marked as exterior.

Step 3. Link marked points and edges into simple polygons.

Step 4. Delete polygons of length 4.

Step 5. Sort the remaining polygons into external boundaries and hole boundaries.


To estimate time complexity of Algorithm 1, we will go into further details.

Step 1 is equivalent to the problem of construction of the intersection graph for the set of circles in the plane of
radii d/2 centered at the points {a_1, a_2, ..., a_n}. We propose the following algorithm.

Step 1a. Construct the intersection graph of squares centered at points {a_1, a_2, ..., a_n}
with sides of length d and with sides parallel to coordinate axes.

Step 1b. For every pair of intersecting squares, check whether their inscribed circles intersect.

It is well-known that Step 1a may be executed in time O(n log n + k), where k is the number of intersections of squares.

Time complexity of the whole Step 1 is determined with the help of the following theorem.


Statement 1

Let {R_1, R_2, ..., R_n} and {S_1, S_2, ..., S_n} be the sets of squares of the same size with sides parallel of coordinate axes and the respective inscribed circles and let k and t be the numbers of intersections in the sets {R_i} and {S_i} respectively. Then k = O(t + n).

As a corollary, the complexity of Step 1 of Algorithm 1 is equal to O(n log n + t), where t is the number of edges of the d-near neighbor graph.

Time complexity for Step 2 is O(min {n, t log n}), [4].

Time complexity of steps 3, 4 is O(n).

In general case, it is known that time complexity of Step 5 (sorting polygons into external and internal boundaries) is O(n log n) [1]. However in our particular case the following statement allows us to reduce time complexity to O(n).


Statement 2

Let a_0 be the lowest leftmost vertex of a polygon constructed at Step 3 of Algorithm 1. If this polygon is an external boundary then Condition A was true for point a_0 at Step 2b, otherwise Condition B was true for a_0 at Step 2b.

Summing the estimates of all steps, we obtain that the worst-case time complexity of Algorithm 1 is

O(min {n, (n+t)log n}).

This algorithm is efficient for sparse point sets, for which t = Omega(n). In this case time complexity of the algorithm is reduced to O(n log n).

Scanbox Algorithm

For "dense" point sets, when the threshold d is taken too large resulting in large t, the following algorithm is proposed.

Consider the plane partitioned into square boxes with side d, i.e., into the sets

B_ij = {(x,y): id <= x < (i+1)d, jd <= y < (j+1)d}.

Let P_ij be the subset of the input points falling inside B_ij and let S be the set of all nonempty boxes B_ij.

Define the d-near neighborhood graph on the set S to be the graph with vertex set S in which two vertices S_i and S_j are connected by an edge if there are two points a_i, a_j from {a_1, ..., a_n} such that a_i is in S_i and a_j is in S_j and the distance between a_i and a_j is less than d. The following simple statement will be useful in the construction of the algorithm.

Statement 3.

If S_i and S_j are adjacent vertices in the d-near neighborhood graph of set S, then these boxes are 8-adjacent in the plane (i.e., adjacent in the horizontal, vertical, or diagonal direction).


Figure 2. Algorithm 2, Step 3.



Sketch of Algorithm 2

Step 1. Construct the d-near neighbor graph for boxes of S using the following sub-steps.

Step 1a. Sort input points into boxes B_ij.

Step 1b. Inside each nonempty box S_i construct the convex hull of points lying inside it.

Step 1c. For each pair of 8-adjacent boxes find the distance between the corresponding point sets.

Step 2. Construct the "shape" for the set S, e.g., using an algorithm similar to Algorithm 1 or by scanbox approach.

Step 3. For each pair of consecutive boxes from the boundary of the shape from Step 2 construct an appropriate common
tangent for convex hulls constructed at Step 1b.

Step 4. If the length of some tangent exceeds d or a pair of tangents intersect, remove the violation by moving the "touch"
points along the polygons. Finally, collect polygons from tangents, moved tangents and chains from convex
polygons from step 1b.

To estimate time complexity, note first that this partition into steps is made only for conceptual clarity. In fact most of them may be implemented in the course of a single scan through the plane by a pair of adjacent boxes. Bearing this in mind, let us estimate the separate operations of the algorithm.

Step 1a takes O(n) time. Step 1b is estimated by O(n log n) time [3]. At step 1c, the distance between a pair of convex polygons may be found in time O(log n) [3], totalling to by O(n log n) for the whole step.

At Step 2, constructing the neighborhood graph may be performed in time O(n log n) using Algorithm 1, since the corresponding vertex set is inherently "sparse" (every vertex may have at most 8 neighbors). However, as it is already noted, a simpler approach is possible using scanbox approach.

At Step 3, a common tangent for a pair of convex polygons may be constructed in time O(log n) using dichotomy approach described, e.g., in the divide-and-conquer approach for convex hull construction [5].

Finally, Step 4 is clearly O(n), which gives the total estimate of O(n log n) for time complexity of the Scanbox Algorithm.



  1. V. Feinberg. Geometrical Problems of VLSI Computer Graphics, Radio i Sviaz, Moscow, 1987.
  2. Generalized Convexity Approach to Cell Boundary Shape Approximation, TCAD Driven CAD, 8 (1997), no. 9, pp. 8-9.
  3. Handbook of Discrete and Computational Geometry, eds. J.E. Goodman and J. O'Rourke, CRC Press, 1997, 991p.
  4. J. E. Goodman and R. Pollack, Multidimensional sorting, SIAM J. Comput. 12 (1983) 484-507.
  5. F.P. Preparata and M.I. Shamos, Computational Geometry: An Introduction. Springer-Verlag, New York, 1985.