Texture discrimination using multimodal wavelet packet subbands

The subband histograms of wavelet packet bases adapted to individual texture classes often fail to display the leptokurtotic behaviour shown by the standard wavelet coefficients of 'natural' images. While many subband histograms remain leptokurtotic in adaptive bases, some subbands are Gaussian. Most interestingly, however, some subbands show multimodal behaviour, with no mode at zero. In this paper, we provide evidence for the existence of these multimodal subbands and show that they correspond to narrow frequency bands running throughout images of the texture. They are thus closely linked to the texture's structure. As such, they seem likely to possess superior descriptive and discriminative power as compared to unimodal subbands. We demonstrate this using both Brodatz and remote sensing images.


INTRODUCTION
The analysis of the statistics of wavelet coefficients has for the most part focused on 'natural images' and standard wavelets [1,2,3,4,5]. Such statistics are necessarily mixtures from a variety of sources, both because 'natural images' consist of regions corresponding to many different entities, and because standard wavelets mix frequency bands that may have very different behaviours individually. The statistics of coherent sets of images analysed with bases other than standard wavelet bases may therefore be very different from the leptokurtotic histograms found in such mixtures.
The subband histograms of wavelet packet bases adapted to coherent classes of texture [6,7,8] confirm this intuition. Brady et al. [6] use Gaussian models in which the covariance is assumed diagonal in at least one wavelet packet basis. This basis is learned from examples, and adapts to each texture modelled. Spatial dependencies are captured by the basis itself rather than explicitly as in hidden Markov tree models [4,5]. The resulting subband histograms fall into three classes. Many subbands still show leptokurtotic behaviour. Others, usually those with a smaller frequency range, show Gaussian behaviour. Most interestingly, some of the subbands with the narrowest frequency content are multimodal, with no mode at zero. Intuitively, the presence of a mode at zero indicates that the wavelet representation is sparse. This is true of 'natural' images, in which there are large regions that are relatively flat compared to the edges present, but it is no longer true when a phenomenon occurring throughout the images of a given class is focused on by a particular subband.
Although the multimodal subbands were discovered using Gaussian models, these are clearly unsuitable given the observed statistics. In [7,8], a new model was developed incorporating the multimodal behaviour. This model is briefly described in section 2. It confirms and refines the results of the Gaussian models, and provides hints of even more complex and structured behaviour [7].
The aim of this paper is to demonstrate the existence and discriminatory power of multimodal subbands. First, in section 3, we provide theoretical and empirical evidence for the existence of multimodality and show that multimodality is strictly related to the main periodicities characterizing the texture's structure. Then, in section 4, we demonstrate the discriminatory power of multimodality using both Brodatz textures and textures taken from remote sensing imagery. We summarize in section 5.

MODEL FOR MULTIMODAL SUBBANDS
In [8], a model was developed to include the newly observed multimodal subbands. In this model, each subband is modelled by one of three distributions: Gaussian (G); generalized Gaussian (GG); or a constrained mixture of three Gaussians (MoG). The full model is parameterized by the following data: a dyadic partition of one quadrant of the Fourier domain, T, which, given a mother wavelet, defines a wavelet packet basis; a map µ from T to a set M of three models, {G, GG, MoG}, giving the model used in each subband; a map θ from T to the space of model parameters for each subband. The subbands in T are assumed independent.
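As an illustration of the three candidate subband distributions, the following is a minimal sketch of their log densities. The parameterizations are assumptions made for the example, not taken from [8]: the generalized Gaussian uses a scale `alpha` and shape `beta`, and the constraint on the mixture is assumed to be symmetry about zero (components at `-mean`, `0`, and `+mean`, with the two side components sharing weight and standard deviation). All function and parameter names are hypothetical.

```python
import math


def gaussian_logpdf(w, sigma):
    """Log density of a zero-mean Gaussian (model G)."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - w ** 2 / (2.0 * sigma ** 2)


def generalized_gaussian_logpdf(w, alpha, beta):
    """Log density of a zero-mean generalized Gaussian (model GG).

    alpha is the scale and beta the shape: beta = 2 recovers a Gaussian,
    beta < 2 gives a peaked, heavy-tailed (leptokurtotic) density.
    """
    log_norm = math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta)
    return log_norm - (abs(w) / alpha) ** beta


def symmetric_mog_logpdf(w, mean, sigma_side, sigma_centre, weight_centre):
    """Log density of a three-Gaussian mixture (model MoG).

    ASSUMPTION: the mixture is constrained to be symmetric about zero,
    with components centred at -mean, 0 and +mean.
    """
    weight_side = 0.5 * (1.0 - weight_centre)
    components = [
        (weight_side, -mean, sigma_side),
        (weight_centre, 0.0, sigma_centre),
        (weight_side, mean, sigma_side),
    ]
    # Log-sum-exp over the components that carry non-zero weight.
    logs = [math.log(p) + gaussian_logpdf(w - m, s)
            for p, m, s in components if p > 0.0]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))
```

With a small central weight, the mixture has its modes near `±mean` and a dip at zero, which is the shape needed for a subband with no mode at zero.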
In [8], we describe how to find exact MAP estimates of T, µ, and θ from training data for a texture class, using an efficient depth-first search algorithm on the space of dyadic partitions. The resulting models capture the different statistics of the adaptive subbands, and in particular the multimodal statistics. Examples will be shown in section 4.
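To give a feel for how a depth-first search over dyadic partitions can work, here is a minimal sketch in the style of a standard best-basis recursion. It is not the algorithm of [8]. It assumes that each subband has an additive cost (for instance, the negative log posterior of the best of the three models fitted to that subband), which is plausible because the subbands are assumed independent. A node `(level, row, col)` denotes a square of the frequency quadrant; `best_partition`, `leaf_cost`, and `toy_cost` are hypothetical names.

```python
def best_partition(node, leaf_cost, max_depth):
    """Return (cost, leaves) of the cheapest dyadic partition below `node`.

    node      : (level, row, col), a square of the frequency quadrant
    leaf_cost : function giving the cost of keeping a node as one subband
                (ASSUMPTION: costs of different subbands simply add)
    max_depth : deepest level at which a node may still be kept
    """
    level, row, col = node
    keep_cost = leaf_cost(node)
    if level == max_depth:
        return keep_cost, [node]

    # Depth-first: solve each of the four children before deciding here.
    split_cost, split_leaves = 0.0, []
    for dr in (0, 1):
        for dc in (0, 1):
            child = (level + 1, 2 * row + dr, 2 * col + dc)
            cost, leaves = best_partition(child, leaf_cost, max_depth)
            split_cost += cost
            split_leaves.extend(leaves)

    if split_cost < keep_cost:
        return split_cost, split_leaves
    return keep_cost, [node]


def toy_cost(node):
    """Hypothetical costs, chosen only to exercise the recursion."""
    level, row, col = node
    if level == 0:
        return 12.0
    if level == 1:
        return 5.0 if (row, col) == (0, 0) else 2.0
    return 1.0


cost, leaves = best_partition((0, 0, 0), toy_cost, max_depth=2)
```

In this toy example the root is split, and of its four children only `(1, 0, 0)` is split further, so the partition adapts its resolution to where the cost favours narrower subbands.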
Note that the model is not limited to textures that show multimodality: it is perfectly capable of modelling textures without multimodal subbands. When such subbands are present, however, it is in a position to take advantage of the extra information they provide.

Theory
Multimodal subband histograms, or more precisely, subband histograms with no mode at zero, are closely connected to the presence of structures running throughout the images of a texture class. The basic cause can be indicated very briefly, with an idealized model.
Suppose that a (1d) texture class consists of all translations of a sinusoid with fixed frequency k_0 and amplitude B. The mean histogram of a subband under a translation-invariant distribution is equal to the marginal distribution of any wavelet packet coefficient from that subband. We will suppose, without loss of generality, that the Fourier amplitude of such a wavelet packet coefficient at frequency k_0 is one. The coefficient then takes the value w = B cos φ, with the phase φ uniformly distributed, and it is easy to see that the distribution of this coefficient takes the bimodal form
$$p(w) = \frac{1}{\pi \sqrt{B^2 - w^2}}, \qquad |w| < B,$$
which is smallest at w = 0 and diverges at w = ±B. Intuitively, this arises from the fact that in a sinusoid of amplitude B, the values ±B occur 'more often' than other values (including zero).
As is demonstrated in [7], more complex models, involving several sinusoids with noise added to the amplitudes, blur but do not destroy the bimodality. These models demonstrate how periodicities in the texture can produce multimodal subband histograms if the wavelet packet basis is sufficiently focused on the frequencies involved.
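The idealized argument can be checked with a small simulation. The sketch below assumes that a coefficient focused on the sinusoid's frequency behaves like the amplitude times the cosine of a uniformly random phase (a random translation), and optionally perturbs the amplitude with Gaussian noise. The function names, sample size, and noise level are choices made for this example only.

```python
import math
import random


def sample_coefficients(n, amplitude, noise_std, rng):
    """Draw n coefficients amplitude * cos(phase) with uniform random phase.

    ASSUMPTION: this stands in for a wavelet packet coefficient tuned to
    the sinusoid's frequency. noise_std > 0 adds Gaussian noise to the
    amplitude of each sample.
    """
    samples = []
    for _ in range(n):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        noisy_amplitude = amplitude + rng.gauss(0.0, noise_std)
        samples.append(noisy_amplitude * math.cos(phase))
    return samples


def histogram(values, lo, hi, bins):
    """Count values falling in `bins` equal-width bins covering [lo, hi)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        k = math.floor((v - lo) / width)
        if 0 <= k < bins:
            counts[k] += 1
    return counts


rng = random.Random(0)
clean = histogram(sample_coefficients(20000, 1.0, 0.0, rng), -1.0, 1.0, 21)
noisy = histogram(sample_coefficients(20000, 1.0, 0.1, rng), -1.5, 1.5, 21)
```

In `clean`, the outermost bins (values near ±B) are the fullest and the central bin is the emptiest. In `noisy`, the peaks are smoothed and shifted slightly inwards, but the central bin remains well below them.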

Experiment
Figure 2 shows the results of training the model on four textures, two from the Brodatz album, and two from aerial remote sensing images. The second row shows T and µ in graphical form, the lines indicating the dyadic partition, and the colours in each subband indicating the model assigned to that subband, dark grey being GG, light grey G, and white MoG. As can be seen, the multimodal subbands are closely connected to the periodicities present in the textures. The third row shows the histogram of a selected multimodal subband (blue/full line) and the model fitted to it (red/dashed). The double peaks are not as clear in the remote sensing images. This is partly due to the patches in the two images where there are no trees. If training is performed on samples without these 'holes', a more pronounced bimodality is found.
Further experiments show the existence of multimodal subbands in many Brodatz textures, and in remote sensing textures such as ploughed fields, planted forests, and certain configurations of buildings.

DISCRIMINATION AND DESCRIPTION
In all the experiments reported below, the following procedure was used. First, a model was trained for each of the two classes involved. Second, a subband was selected from one of the models, according to criteria that will be detailed in a moment. Call this subband S. A new model of subband S was then trained for the second class. Third, the undecimated wavelet packet coefficients in S were calculated for the test image. Fourth, for each pixel p, class probabilities were computed from the models of subband S only, using the coefficients in S belonging to a patch centred on p, of size equal to the subband filter size of S. Thus only the data lying in S was used to compute the probabilities. The differences of the log probabilities of each class form the 'probability map'.
Fig. 3. On the left is a mosaic formed of the Herring and Raffia Brodatz textures. On the right, the result of classification using one subband multimodal for Raffia, but unimodal for Herring.
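The fourth step of the procedure can be sketched as follows. The example assumes that the coefficients within a patch are treated as independent, so that the log probability of a patch under each class is the sum of the per-coefficient log densities. It uses two hypothetical single-subband models: a symmetric two-mode mixture for the class in which S is multimodal, and a zero-mean Gaussian for the other class. Names, parameter values, and the clipped handling of patches at the image border are illustrative choices.

```python
import math


def normal_logpdf(w, mean, sigma):
    """Log density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (w - mean) ** 2 / (2.0 * sigma ** 2)


def bimodal_logpdf(w, mode=2.0, sigma=0.5):
    """Hypothetical model of subband S for the class where S is multimodal."""
    a = normal_logpdf(w, -mode, sigma)
    b = normal_logpdf(w, mode, sigma)
    top = max(a, b)
    return math.log(0.5) + top + math.log(math.exp(a - top) + math.exp(b - top))


def unimodal_logpdf(w, sigma=0.5):
    """Hypothetical model of subband S for the class where S is unimodal."""
    return normal_logpdf(w, 0.0, sigma)


def log_probability_map(coeffs, logpdf_a, logpdf_b, patch):
    """Difference of patch log probabilities (class a minus class b) per pixel.

    coeffs : 2d list of undecimated coefficients of subband S
    patch  : odd patch side length; patches are clipped at the image border
    ASSUMPTION: coefficients in a patch are independent, so their log
    densities are summed.
    """
    rows, cols = len(coeffs), len(coeffs[0])
    half = patch // 2
    ratio = [[logpdf_a(w) - logpdf_b(w) for w in row] for row in coeffs]
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    total += ratio[rr][cc]
            result[r][c] = total
    return result
```

Positive values of the map favour the class in which S is multimodal and negative values favour the other class. For large images, the inner double loop could be replaced by a summed-area table.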
Note that these experiments are not designed to produce the best possible classification results. The full model (i.e. using all the subbands) always produces a better result. What is remarkable is that the results of the above procedure using a subband that is multimodal for one texture but not for the other are often very good (indeed, for Brodatz textures they approach the performance of the full model), whereas the results using a subband that is unimodal for both textures are very poor.
Figure 3 shows a mosaic made of the Herring and Raffia textures from the Brodatz album. On the right is the result of applying the above procedure using a subband multimodal for Raffia, but unimodal for Herring. Similar results were obtained for other multimodal subbands. The same procedure was then repeated using several subbands of the same size, but unimodal for both textures. The results were all very poor, bearing very little if any connection to the mosaic. Indeed, in many cases the whole image was classified as belonging to one class. The average pixel misclassification rates were as follows. Using Raffia multimodal subbands, the rate was 14.9% for Raffia and 0.7% for Herring. Using Herring multimodal subbands, the rate was 5.1% for Raffia and 17.7% for Herring. Using unimodal subbands, the rate was 80% for Herring and 6.5% for Raffia. The last number is low for the reason already mentioned: most of the unimodal results classified most of the image as Raffia.
Figure 4 shows reconstructions of the mosaic in figure 3 using one of the unimodal subbands (on the left) and one of the multimodal subbands (on the right). The left-hand image shows almost no trace of the mosaic structure, which accounts for the poor classification result. In contrast, the image reconstructed from the multimodal subband clearly shows the mosaic structure, the amplitude of the periodicity being much larger in Raffia than in Herring.
Figure 5 shows a remote sensing image. Models were trained on 'ploughed field' using the upper part of the image, and on 'unploughed field' using the lower part. The above procedure was then followed for a subband multimodal for 'ploughed field' but unimodal for 'unploughed field'. The resulting probability map is shown on the right. The map indicates the ability of the multimodal subband to provide important information for discriminating between the two textures, and in particular to assign a consistently low probability to 'unploughed field'.
Figure 6 shows a second remote sensing image. Models were trained on the 'forest' texture in the lower part of the image, and on a 'background' class consisting of the top left-hand corner. On the right of the figure is shown the result of reconstructing the image using just the multimodal subband in the 'forest' texture. Note how the tree structure of the 'forest' texture has been captured, and how this same structure captures the presence of other trees in the image, while other areas show small response (the image has been linearly stretched for visualization). An exception is the bottom left-hand corner, where the arrangement and size of the buildings represent roughly the same periodicity as the forest. Note that the subband used here is a standard wavelet subband. Analyzed using a Gaussian or generalized Gaussian model, this subband is not at all remarkable. Nevertheless, the new model was able to detect the multimodality in this subband, and hence capture the structure.
Fig. 6. On the left, a remote sensing image. On the right, the reconstruction using a subband multimodal for the 'forest' texture in the lower part of the image.
Figure 7 shows another remote sensing image. The classes were 'poplar stand', on the right of the image, and 'forest' on the left. On the right is shown the probability map resulting from the use of a subband multimodal for 'poplar stand', but unimodal for 'forest'. Note again the consistent assignment of a very low value to 'forest', and a much higher value generally to 'poplar stand'.
Fig. 7. On the left, a remote sensing image. On the right, the probability map resulting from the use of one subband multimodal for the 'poplar stand' texture on the right and unimodal for the 'forest' texture on the left.

CONCLUSION
In this paper, we have provided evidence for the existence of multimodal wavelet packet subbands in textures, and have demonstrated their link to the characteristic structure of a texture. We have also demonstrated the descriptive and discriminative power of the model of these multimodal subbands developed in [8]. The new multimodal statistics are only revealed when attention is shifted away from the universal behaviour displayed by the standard wavelet coefficients of whole images, which are necessarily mixtures of many components, towards the behaviour shown by the coefficients of bases adapted to individual components of these mixtures.
Practically speaking, while the classification maps obtained for real remote sensing images are not as accurate as for synthetic images, the probability maps provide useful information.For example, in cases where the classes are spectrally overlapped or otherwise poorly separable, a complex classification system could jointly exploit multispectral information and the information contained in this probability map to obtain a more accurate classification result.

Fig. 1. An example of the MoG model used to model multimodal subbands.

Fig. 2. Some textures, their optimal decompositions (dark grey = GG, light grey = G, white = MoG), and examples of multimodal subband histograms and the models fitted to them.

Fig. 4. On the left, the reconstruction of the mosaic in figure 3 using a single unimodal subband. On the right, the reconstruction of the mosaic using a single multimodal subband.

Fig. 5. On the left, a remote sensing image. On the right, the probability map resulting from the use of one subband multimodal for the 'ploughed field' texture, but unimodal for the 'unploughed field' texture.