At first glance, one might consider knowledge to be the
solution itself, but the knowledge we extract may not be
easily used, on its own, for a task. Since the theme of
knowledge based clustering is integration, we use both
knowledge and clustering as equal tools. It is
important to apply abstracted knowledge properly.
A. Combination of Over-Clusters: The basic idea behind knowledge based clustering is to iteratively cluster and label regions of interest produced by over-clustering. The term over-clustering means that at the initial clustering step, we deliberately cluster the data set into a greater number of classes than the number of objects in the data set. This inherently leads to some level of over-segmentation, but it also reduces both the chance and frequency that different classes are combined into one class. This step is necessary because different classes may be very close in some features and have a higher tendency to be under-segmented. Also, combining clusters that belong together is a much simpler task than splitting up those that do not.
The primary source of knowledge for merging
over-segmentation comes from the pattern distribution in
feature space. This knowledge allows us to identify
partial clusters and merge them into proper
classes. Certain heuristics are relatively obvious, such
as the fact that the cluster centers of split classes are next to one
another when
projected in feature space, but there is usually a wide
variety over image slices; regions of a class that are split up in one case may
not necessarily be split in another. A working knowledge
based system must be able to handle different types of splitting of
true clusters. Given an unusual split, a decision will, ideally, be
delayed until other processing modules provide information. The knowledge
also serves a labeling purpose, since merged clusters
have been identified.
B. Focus-of-attention: Merging and labeling over-segmentations into an accurately segmented image is the overall goal of knowledge based clustering. ``Focus-of-attention'' is the key to accomplishing the goal. We define focus-of-attention as the process of using knowledge to identify regions of interest and ``zoom'' in on them iteratively. This allows us to isolate the region and perform further clustering. An isolated region will have both less patterns to cluster and, generally, a smaller number of clusters to break the region into. Since it is an iterative process, it is possible for these subclusters to need additional focusing. We refer to clusters/patterns selected by this process as ``focus-of-attention clusters/patterns.''
Generally, both pattern distribution and domain specific
knowledge are used. When this knowledge causes focus on a
certain spatial area, the makeup, including the number of
classes there are in the chosen patterns and possibly a
small number of prototypical patterns of a class is
either known or estimated. The prototypical patterns of
a class provide seed patterns of that class for
initializing the reclustering algorithm and allow input
patterns to be labeled as the class of prototypical
patterns in its final cluster. Focus-of-attention also
locates partial patterns belonging to a
known object, which leads to reclustering regions of interest with
these vectors (patterns) to guide initialization.
C. Cluster Initialization: Instead of having human experts provide training patterns as initialization for the clustering algorithm, our concept applies the information gained from an initial clustering during the iterative reclustering stage. Since ``focus-of-attention'' identifies regions of interest and provides us with the number of classes and possible prototypical patterns, we can embed this knowledge into our clustering algorithm and make it more effective.
We used FCM in our experiments and in the following, we show how to embed this knowledge in FCM. We should note, however, that this idea may be implemented [7] for other clustering algorithms.
FCM is provided with (a1) patterns of interest, a data
set of p tuples of reals {
,
, ...,
} to be clustered;
(b1) number of expected classes c ;
and (c1) prototypical patterns of a
class t or several classes, which are a subset of patterns of
interest. Labeled or prototypical patterns may be
provided initially or identified by domain knowledge during
reclustering. FCM creates c fuzzy
sets and a set of c initial class centers {
,
, ... ,
}, each being p
tuples of reals.
The
fuzzy set has the form
=
{
,
, ...,
}, in which
represents that
belongs to the
class with a membership grade
.
The c fuzzy sets form the rows of a fuzzy partition
matrix
. The
row is initialized so
that the membership grade of each position corresponding
to a known
initial pattern of class t is 1 and the grades of
all other positions of the row are set to be 0. Other
rows of the matrix are randomly initialized so that each
row contains at least two 1's and the summation of
membership grades of any column
is equal to 1. Defuzzification of the matrix is
done to achieve labeling after clustering.
Prototypical patterns from more than one class can be
similarly implemented for multiple rows of known classes.
Other parameters of the algorithm used here are : m=2,
, and
is the Euclidian norm.