Wavelet Introduction (6)

So far, the “killer app” for wavelets has been digital image compression. They are central to the new JPEG-2000 digital image standard and the WSQ (wavelet scalar quantization) method that the FBI uses to compress its fingerprint database. In this context, wavelets can be thought of as the building blocks of images. An image of a forest can be made from the broadest wavelets: a big swath of green for the forest, a splash of blue for the sky. More detailed, sharper wavelets can help distinguish one tree from another. Branches and needles can be added to the image with even finer wavelets. Like an individual brush stroke in a painting, each wavelet is not itself an image, but many wavelets together can recreate anything. Unlike a brush stroke in a painting, a wavelet can be made arbitrarily small: A wavelet has no physical size limitations because it is simply a series of 0s and 1s stored in a computer’s memory.

Contrary to popular belief, wavelets themselves do not compress an image: Their job is to make compression possible. To understand why, suppose that an image is encoded as a series of spatially arranged numbers, such as 1, 3, 7, 9, 8, 8, 6, 2. If each number represents the darkness of a pixel, with 0 being white and 15 being black, then this string represents some kind of gray object (the values from 6 to 9) against a light background (the values from 1 to 3).

The simplest kind of multiresolution analysis filters the image by averaging each non-overlapping pair of adjacent pixels: the first with the second, the third with the fourth, and so on. In the above example, this results in the string 2, 8, 8, 4: a lower-resolution image that still shows a grayish object against a light background. If we wanted to reconstruct a degraded version of the original image from this, we could do so by repeating each number in the string: 2, 2, 8, 8, 8, 8, 4, 4.
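The averaging and repeating steps can be sketched in a few lines of Python. This is only an illustration; the function names `average_pairs` and `repeat_each` are made up for this example and do not come from any library.

```python
def average_pairs(signal):
    """Average each non-overlapping pair of adjacent samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def repeat_each(values):
    """Stretch a low-resolution signal back to full length by repeating each value."""
    return [v for v in values for _ in range(2)]


pixels = [1, 3, 7, 9, 8, 8, 6, 2]   # the example string from the text
low_res = average_pairs(pixels)     # half as many samples
degraded = repeat_each(low_res)     # same length as the original, but blurrier

print(low_res)    # [2.0, 8.0, 8.0, 4.0]
print(degraded)   # [2.0, 2.0, 8.0, 8.0, 8.0, 8.0, 4.0, 4.0]
```

The sketch assumes the signal has an even number of samples, as the example does.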

Suppose, however, that we wanted to get back the original image perfectly. To do this, we would have to save some additional information in the first step, namely a set of numbers that can be added to or subtracted from the low-resolution signal to obtain the high-resolution signal. In the example, those numbers are -1, -1, 0, and 2. (For example: Adding -1 to the first pixel of the degraded image, 2, gives 1, the first pixel of the original image; subtracting -1 from it gives 3, the second pixel of the original image.)

Thus the first level of the multiresolution analysis splits the original signal up into a low-resolution part (2, 8, 8, 4) and a high-frequency or “detail” part (-1, -1, 0, 2). The high-frequency details are also called the Haar wavelet coefficients. In fact, this whole procedure is the multiresolution version of the wavelet transform Haar discovered in 1909.
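One level of this split, together with the step that undoes it, might look as follows in Python. It is a minimal sketch that uses the unnormalized averages and differences described above (many software libraries scale the Haar coefficients differently), and the names `haar_step` and `haar_inverse_step` are hypothetical.

```python
def haar_step(signal):
    """Split a signal into pair averages and detail coefficients."""
    averages, details = [], []
    for i in range(0, len(signal), 2):
        first, second = signal[i], signal[i + 1]
        avg = (first + second) / 2
        averages.append(avg)
        details.append(first - avg)   # the same number as (first - second) / 2
    return averages, details


def haar_inverse_step(averages, details):
    """Rebuild the original signal: average + detail, then average - detail."""
    signal = []
    for avg, d in zip(averages, details):
        signal.append(avg + d)
        signal.append(avg - d)
    return signal


pixels = [1, 3, 7, 9, 8, 8, 6, 2]
averages, details = haar_step(pixels)
restored = haar_inverse_step(averages, details)

print(averages)   # [2.0, 8.0, 8.0, 4.0]
print(details)    # [-1.0, -1.0, 0.0, 2.0]
print(restored)   # [1.0, 3.0, 7.0, 9.0, 8.0, 8.0, 6.0, 2.0]
```

Because nothing has been rounded yet, the reconstruction is exact: the eight output numbers carry the same information as the eight input pixels.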

It might not seem that the first step of the wavelet transform has gained anything. There were eight numbers in the original signal, and there are still eight numbers in the transform. But in a typical digital image, most pixels will be very much like their neighbors: Sky pixels will occur next to sky pixels, forest pixels next to forest pixels. This means that the averages of nearby pixels will be almost the same as the original pixels, and so most of the detail coefficients will be either zero or very close to zero. If we simply round those coefficients off to zero, then the only information we need to keep is the low-resolution image plus a smattering of detail coefficients that did not get rounded off to zero. Thus, the amount of data required to store the image has been reduced by a factor of almost 2. The process of rounding high-precision numbers into lower-precision numbers with fewer digits is called quantization (the “Q” in “WSQ”). An example is the process of rounding a number to two significant figures.
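The effect of rounding small detail coefficients to zero can be tried on a toy signal. In the sketch below, the pixel row, the threshold of 1, and the function names are all assumptions chosen for illustration. Real codecs such as WSQ or JPEG-2000 use more elaborate quantizers, and they also have to encode where the surviving coefficients sit.

```python
def haar_step(signal):
    """Split a signal into pair averages and detail coefficients."""
    averages, details = [], []
    for i in range(0, len(signal), 2):
        avg = (signal[i] + signal[i + 1]) / 2
        averages.append(avg)
        details.append(signal[i] - avg)
    return averages, details


def quantize(details, threshold):
    """Crude quantizer: round every coefficient smaller than the threshold to zero."""
    return [d if abs(d) >= threshold else 0.0 for d in details]


def haar_inverse_step(averages, details):
    """Rebuild a signal from averages and (possibly quantized) details."""
    signal = []
    for avg, d in zip(averages, details):
        signal.extend([avg + d, avg - d])
    return signal


# Hypothetical row of pixels: mostly flat, with one sharp drop (an edge).
row = [12, 12, 12, 13, 12, 4, 4, 4]
averages, details = haar_step(row)
kept = quantize(details, threshold=1)
approx = haar_inverse_step(averages, kept)

print(details)   # [0.0, -0.5, 4.0, 0.0]
print(kept)      # [0.0, 0.0, 4.0, 0.0]
print(approx)    # [12.0, 12.0, 12.5, 12.5, 12.0, 4.0, 4.0, 4.0]
```

Only one of the four detail coefficients survives, so the row can be described by four averages plus one detail value. The price is a small error where 12, 13 has become 12.5, 12.5, while the sharp drop from 12 to 4 is kept exactly.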

The process of transforming and quantizing can be repeated on the low-resolution image as many times as desired, each time reducing the amount of data by a factor of almost 2 and slightly degrading the quality of the image. Depending on the needs of the user, the process can be stopped before the lower resolution starts to become apparent, or it can be continued to obtain a very low-resolution “thumbnail” image with layers of increasingly accurate details. With the JPEG-2000 standard, one can achieve compression ratios of 200:1 without a perceptible difference in the quality of the image. Such wavelet decompositions are obtained by averaging more than two nearby pixels at a time. The simplest Daubechies wavelet transform, for instance, combines groups of four pixels, and smoother ones combine six, eight, or more.
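Repeating the averaging step on the low-resolution part gives the layered "thumbnail plus details" structure described above. The following Python sketch does this for the simple Haar case only. It leaves out quantization so that the reconstruction is exact, it assumes the signal length is a power of two, and its function names are hypothetical.

```python
def haar_step(signal):
    """One level: pair averages and detail coefficients."""
    averages, details = [], []
    for i in range(0, len(signal), 2):
        avg = (signal[i] + signal[i + 1]) / 2
        averages.append(avg)
        details.append(signal[i] - avg)
    return averages, details


def haar_decompose(signal):
    """Keep splitting the low-resolution part until one value remains."""
    approx = list(signal)
    detail_levels = []                 # finest details first, coarsest last
    while len(approx) > 1:
        approx, details = haar_step(approx)
        detail_levels.append(details)
    return approx, detail_levels


def haar_reconstruct(approx, detail_levels):
    """Add the detail layers back, coarsest first, to recover the signal."""
    signal = list(approx)
    for details in reversed(detail_levels):
        rebuilt = []
        for avg, d in zip(signal, details):
            rebuilt.extend([avg + d, avg - d])
        signal = rebuilt
    return signal


pixels = [1, 3, 7, 9, 8, 8, 6, 2]
thumbnail, layers = haar_decompose(pixels)

print(thumbnail)   # [5.5]
print(layers)      # [[-1.0, -1.0, 0.0, 2.0], [-3.0, 2.0], [-0.5]]
print(haar_reconstruct(thumbnail, layers))
# [1.0, 3.0, 7.0, 9.0, 8.0, 8.0, 6.0, 2.0]
```

Stopping the reconstruction loop early, before the finest layers are added, yields the intermediate lower-resolution versions of the signal.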

One fascinating property of wavelets is that they automatically pick out the same features our eyes do. The wavelet coefficients that are still left after quantization correspond to pixels that are very different from their neighbors, that is, pixels at the edges of the objects in an image. Thus, wavelets recreate an image mostly by drawing edges, which is exactly what humans do when they sketch a picture. Indeed, some researchers have suggested that the analogy between wavelet transforms and human vision is no accident, and that our neurons filter visual signals in a similar way to wavelets.
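The link between surviving coefficients and edges can be seen on a one-dimensional toy example. The pixel row and the threshold below are made up for illustration, and the sketch uses only the first-level Haar details.

```python
def haar_details(signal):
    """First-level Haar detail coefficients: half the difference within each pair."""
    return [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


# Hypothetical row: light background (2s), then a dark object (9s).
row = [2, 2, 2, 2, 2, 9, 9, 9]
details = haar_details(row)

# Indices of the pairs whose detail coefficient survives a threshold of 1.
edge_pairs = [k for k, d in enumerate(details) if abs(d) >= 1]

print(details)      # [0.0, 0.0, -3.5, 0.0]
print(edge_pairs)   # [2]
```

The only large coefficient belongs to pair 2, which covers pixels 4 and 5, exactly where the row jumps from light to dark. One caveat of this simple scheme is that an edge falling exactly between two pairs produces no first-level detail at all and shows up only at a coarser level of the transform.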


How Do Wavelets Work?