next up previous
Next: MR Images of Up: Integrating Knowledge in Previous: Knowledge Sources

Knowledge Applications

At first glance, one might consider knowledge to be the solution itself, but the knowledge we extract may not be easily used, on its own, for a task. Since the theme of knowledge based clustering is integration, we use both knowledge and clustering as equal tools. It is important to apply abstracted knowledge properly.

A. Combination of Over-Clusters: The basic idea behind knowledge based clustering is to iteratively cluster and label regions of interest produced by over-clustering. The term over-clustering means that at the initial clustering step, we deliberately cluster the data set into a greater number of classes than the number of objects in the data set. This inherently leads to some level of over-segmentation, but it also reduces both the chance and frequency that different classes are combined into one class. This step is necessary because different classes may be very close in some features and have a higher tendency to be under-segmented. Also, combining clusters that belong together is a much simpler task than splitting up those that do not.

The primary source of knowledge for merging over-segmentation comes from the pattern distribution in feature space. This knowledge allows us to identify partial clusters and merge them into proper classes. Certain heuristics are relatively obvious, such as the fact that the cluster centers of split classes are next to one another when projected in feature space, but there is usually a wide variety over image slices; regions of a class that are split up in one case may not necessarily be split in another. A working knowledge based system must be able to handle different types of splitting of true clusters. Given an unusual split, a decision will, ideally, be delayed until other processing modules provide information. The knowledge also serves a labeling purpose, since merged clusters have been identified.

B. Focus-of-attention: Merging and labeling over-segmentations into an accurately segmented image is the overall goal of knowledge based clustering. ``Focus-of-attention'' is the key to accomplishing the goal. We define focus-of-attention as the process of using knowledge to identify regions of interest and ``zoom'' in on them iteratively. This allows us to isolate the region and perform further clustering. An isolated region will have both less patterns to cluster and, generally, a smaller number of clusters to break the region into. Since it is an iterative process, it is possible for these subclusters to need additional focusing. We refer to clusters/patterns selected by this process as ``focus-of-attention clusters/patterns.''

Generally, both pattern distribution and domain specific knowledge are used. When this knowledge causes focus on a certain spatial area, the makeup, including the number of classes there are in the chosen patterns and possibly a small number of prototypical patterns of a class is either known or estimated. The prototypical patterns of a class provide seed patterns of that class for initializing the reclustering algorithm and allow input patterns to be labeled as the class of prototypical patterns in its final cluster. Focus-of-attention also locates partial patterns belonging to a known object, which leads to reclustering regions of interest with these vectors (patterns) to guide initialization.

C. Cluster Initialization: Instead of having human experts provide training patterns as initialization for the clustering algorithm, our concept applies the information gained from an initial clustering during the iterative reclustering stage. Since ``focus-of-attention'' identifies regions of interest and provides us with the number of classes and possible prototypical patterns, we can embed this knowledge into our clustering algorithm and make it more effective.

We used FCM in our experiments and in the following, we show how to embed this knowledge in FCM. We should note, however, that this idea may be implemented [7] for other clustering algorithms.

FCM is provided with (a1) patterns of interest, a data set of p tuples of reals {, , ..., } to be clustered; (b1) number of expected classes c ; and (c1) prototypical patterns of a class t or several classes, which are a subset of patterns of interest. Labeled or prototypical patterns may be provided initially or identified by domain knowledge during reclustering. FCM creates c fuzzy sets and a set of c initial class centers {, , ... , }, each being p tuples of reals. The fuzzy set has the form = {, , ..., }, in which represents that belongs to the class with a membership grade . The c fuzzy sets form the rows of a fuzzy partition matrix . The row is initialized so that the membership grade of each position corresponding to a known initial pattern of class t is 1 and the grades of all other positions of the row are set to be 0. Other rows of the matrix are randomly initialized so that each row contains at least two 1's and the summation of membership grades of any column is equal to 1. Defuzzification of the matrix is done to achieve labeling after clustering. Prototypical patterns from more than one class can be similarly implemented for multiple rows of known classes. Other parameters of the algorithm used here are : m=2, , and is the Euclidian norm.



next up previous
Next: MR Images of Up: Integrating Knowledge in Previous: Knowledge Sources



Matthew &
Wed Mar 8 10:11:57 EST 1995