Scale-space representation: Definition and basic ideas (taken from http://www.nada.kth.se/~tony/cern-review/cern-html/node2.html)
Scale-space theory is a framework for early visual operations, which has been
developed by the computer vision community (in particular by Witkin, Koenderink,
Yuille and Poggio, Lindeberg, and Florack) to handle the above-mentioned
multi-scale nature of image data. A main argument behind its construction is
that if no prior information is available about what are the appropriate scales
for a given data set, then the only reasonable approach for an uncommitted
vision system is to represent the input data at multiple scales. This means that
the original signal should be embedded into a one-parameter family of derived
signals, in which fine-scale structures are successively suppressed (see
figure 1). How should such an idea be carried out in practice? A crucial
requirement is that structures at coarse scales in the multi-scale
representation should constitute simplifications of corresponding structures at
finer scales--they should not be accidental phenomena created by the method for
suppressing fine-scale structures. This idea has been formalized in a variety of
ways by different authors. A noteworthy coincidence is that similar conclusions
can be obtained from several different starting points. A main result is that if
rather general conditions are imposed on the types of computations that are to
be performed, then convolution by the Gaussian kernel and its derivatives is
singled out as a canonical class of smoothing transformations. The requirements
(scale-space axioms) that specify the uniqueness are essentially linearity and
spatial shift invariance, combined with different ways of formalizing the notion
that new structures should not be created in the transformation from fine to
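coarse scales. In summary, for any N-dimensional signal $f \colon \mathbb{R}^N \to \mathbb{R}$,
its scale-space representation $L \colon \mathbb{R}^N \times \mathbb{R}_+ \to \mathbb{R}$
is defined by
$$L(\cdot;\, t) = g(\cdot;\, t) * f,$$
where $g(x;\, t) = \frac{1}{(2\pi t)^{N/2}} \, e^{-x^T x/(2t)}$ is the Gaussian kernel and its variance $t$ is the scale parameter.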
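As a rough illustration, Gaussian smoothing of a one-dimensional signal at several scales can be sketched in plain Python. The sketch assumes the usual form of the definition: the signal is convolved with a Gaussian kernel whose variance t is the scale parameter. The function names, the `truncate` cutoff, the mirrored border handling, and the use of a sampled (rather than carefully discretized) Gaussian are illustrative choices, not taken from the source.

```python
import math


def gaussian_kernel(t, truncate=4.0):
    """Sampled 1-D Gaussian with variance t, normalized to sum to 1.

    Assumption: the kernel is cut off at `truncate` standard deviations.
    """
    if t <= 0:
        return [1.0]  # scale 0 leaves the signal unchanged
    radius = max(1, int(math.ceil(truncate * math.sqrt(t))))
    weights = [math.exp(-(k * k) / (2.0 * t)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _reflect(idx, n):
    """Map any integer index into [0, n) by mirroring at the borders."""
    idx %= 2 * n
    return idx if idx < n else 2 * n - 1 - idx


def smooth(signal, t):
    """Convolve a 1-D signal with the Gaussian kernel of variance t."""
    kernel = gaussian_kernel(t)
    radius = len(kernel) // 2
    n = len(signal)
    return [
        sum(w * signal[_reflect(i + j - radius, n)] for j, w in enumerate(kernel))
        for i in range(n)
    ]


def scale_space(signal, scales):
    """Return {t: smoothed signal} for each requested scale t."""
    return {t: smooth(signal, t) for t in scales}


if __name__ == "__main__":
    noisy_step = [0.0, 0.2, -0.1, 0.1, 0.0, 1.1, 0.9, 1.0, 1.2, 1.0]
    for t, smoothed in scale_space(noisy_step, [0, 1, 4, 16]).items():
        print(t, [round(v, 3) for v in smoothed])
```

Because the kernel weights are non-negative and sum to one, each output sample is a weighted average of input samples, so smoothing can never exceed the range of the original signal.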
Figure 2(a) shows the result of applying Gaussian smoothing to a one-dimensional signal in this way. Notice how this successive smoothing captures the intuitive notion of fine-scale information being suppressed, with the signals becoming successively smoother.

Figure 3 gives a corresponding example for a two-dimensional image. Here, to emphasize the local variations in the grey-level landscape, local minima in the grey-level images at each scale are indicated by dark blobs. The spatial extent of each blob is determined from a watershed analogy, which essentially describes how large a region associated with a local minimum can be filled with water without the water flooding over into regions associated with other local minima. As can be seen, mainly small blobs due to noise and texture are detected at fine scales. After a small amount of smoothing, the buttons on the keyboard manifest themselves as distinct minima, whereas at even coarser scales they merge into one unit (the keyboard). Other dominant dark image structures (such as the calculator, the cord and the receiver) also appear as single blobs at coarser scales.

This example gives one illustration of the types of hierarchical shape decompositions that can be obtained by varying the scale parameter in the scale-space representation. The relations between image structures at different scales induced in this way are referred to as deep structure.
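The way local minima disappear as the scale gets coarser can be imitated on a one-dimensional signal. The following is only a sketch under stated assumptions: it replaces Gaussian convolution with repeated [1/4, 1/2, 1/4] averaging (a simple discrete stand-in whose variance grows by 1/2 per pass), and it merely counts minima rather than computing the watershed-based blob extents described above. All function names are hypothetical.

```python
import math


def binomial_step(signal):
    """One pass of [1/4, 1/2, 1/4] averaging with mirrored borders."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[i - 1] if i > 0 else signal[0]
        right = signal[i + 1] if i < n - 1 else signal[n - 1]
        out.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return out


def count_local_minima(signal):
    """Count interior samples strictly below both neighbours."""
    return sum(
        1
        for i in range(1, len(signal) - 1)
        if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]
    )


def minima_per_scale(signal, n_steps):
    """Minima counts after 0, 1, ..., n_steps smoothing passes."""
    counts = [count_local_minima(signal)]
    current = list(signal)
    for _ in range(n_steps):
        current = binomial_step(current)
        counts.append(count_local_minima(current))
    return counts


if __name__ == "__main__":
    # A slow oscillation (coarse structure) plus a fast ripple (fine structure).
    signal = [
        math.cos(2 * math.pi * i / 40) + 0.2 * math.cos(2 * math.pi * i / 4)
        for i in range(40)
    ]
    print(minima_per_scale(signal, 8))
```

Tracking which minima survive, and into which coarser minimum each one merges, would be the next step towards the hierarchical decomposition discussed in the text; this sketch stops at the counts.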
© Lynne Grewe