Next: Genetically guided clustering
Up: The Case for Genetic
Previous: Introduction
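Consider a set of n vectors $X = \{x_1, x_2, \ldots, x_n\}$ to be
clustered into c groups of like data. Each $x_k \in \Re^s$ is a
feature vector consisting of s real-valued measurements describing the
features of the object represented by $x_k$. The features
could be length, width, color, etc. Hard or fuzzy clusterings of the
objects can be represented by a hard/fuzzy membership matrix
$U = [U_{ik}]$ called a hard/fuzzy partition. The set of all
$c \times n$ non-degenerate
hard (or crisp) partition matrices is denoted by $M_{cn}$ and is
defined as

$$M_{cn} = \left\{ U \in \Re^{c \times n} \;\middle|\; \sum_{i=1}^{c} U_{ik} = 1,\ 0 < \sum_{k=1}^{n} U_{ik} < n,\ U_{ik} \in \{0,1\};\ 1 \le i \le c;\ 1 \le k \le n \right\}.$$

The set of all
$c \times n$ non-degenerate
constrained fuzzy partition matrices is denoted by $M_{fcn}$ and is
defined as

$$M_{fcn} = \left\{ U \in \Re^{c \times n} \;\middle|\; \sum_{i=1}^{c} U_{ik} = 1,\ 0 < \sum_{k=1}^{n} U_{ik} < n,\ U_{ik} \in [0,1];\ 1 \le i \le c;\ 1 \le k \le n \right\}.$$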
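The two partition sets differ only in the values an entry may take: 0 or 1 for a hard partition, anything in [0,1] for a fuzzy one. In both, every column (object) must sum to 1, and every row (cluster) must have a sum strictly between 0 and n. The following is a minimal sketch of that membership test in plain Python; the function name `is_partition` and the tolerance `tol` are illustrative assumptions, not part of the paper.

```python
def is_partition(U, fuzzy=False, tol=1e-9):
    """Check whether U (c rows = clusters, n columns = objects) is a
    non-degenerate hard partition, or a fuzzy one if fuzzy=True.
    The name and the tolerance are assumptions made for this sketch."""
    c, n = len(U), len(U[0])

    # Entry constraint: {0, 1} for hard partitions, [0, 1] for fuzzy ones.
    for row in U:
        for u in row:
            if fuzzy:
                if not (0.0 <= u <= 1.0):
                    return False
            elif u not in (0, 1):
                return False

    # Column constraint: memberships of each object sum to 1.
    for k in range(n):
        if abs(sum(U[i][k] for i in range(c)) - 1.0) > tol:
            return False

    # Non-degeneracy: no cluster is empty and none absorbs every object.
    for row in U:
        total = sum(row)
        if not (tol < total < n - tol):
            return False
    return True


hard = [[1, 0, 0],
        [0, 1, 1]]
soft = [[0.9, 0.2, 0.5],
        [0.1, 0.8, 0.5]]
print(is_partition(hard), is_partition(soft, fuzzy=True))
```

Every hard partition also passes the fuzzy test, since {0,1} is contained in [0,1].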
The clustering criterion used to define good clusters for hard c-means
partitions is the HCM function:

$$J_1(U,V) = \sum_{i=1}^{c} \sum_{k=1}^{n} U_{ik} \, D_{ik}^2(v_i, x_k) \tag{1}$$

where
$U \in M_{cn}$ is a hard partition matrix;
$V = [v_1, \ldots, v_c]$ is a matrix of prototype
parameters (cluster centers)
$v_i \in \Re^s$ for all i; and
$D_{ik}(v_i, x_k)$ is a measure of the distance from
$x_k$ to
the ith cluster prototype. The Euclidean distance metric is used
for all HCM results reported here. Good
cluster structure in X is taken as a (U,V)
minimizer of (1). Typically, optimal (U,V) pairs are sought using
an alternating optimization scheme of the type generally described in
[5,2].
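As an illustration of the hard c-means criterion and of one alternating-optimization pass, the sketch below evaluates the objective as the sum of squared Euclidean distances from each object to the center of the cluster it belongs to. It then alternates a nearest-center assignment with a recomputation of each center as the mean of its members. This is the familiar textbook form, not necessarily the exact scheme of the cited references; the function names and the handling of an empty cluster are assumptions.

```python
def sq_dist(v, x):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, x))


def hcm_objective(U, V, X):
    """Sum over clusters i and objects k of U[i][k] * squared distance."""
    return sum(U[i][k] * sq_dist(V[i], X[k])
               for i in range(len(V)) for k in range(len(X)))


def hcm_step(V, X):
    """One alternating-optimization pass: fix V and pick U, then fix U and pick V."""
    c, n = len(V), len(X)

    # Assignment half-step: each object joins its nearest center.
    U = [[0] * n for _ in range(c)]
    for k, x in enumerate(X):
        nearest = min(range(c), key=lambda i: sq_dist(V[i], x))
        U[nearest][k] = 1

    # Center half-step: each center becomes the mean of its members.
    new_V = []
    for i in range(c):
        members = [X[k] for k in range(n) if U[i][k] == 1]
        if members:
            new_V.append([sum(col) / len(members) for col in zip(*members)])
        else:
            # Assumption: an empty cluster keeps its previous center.
            new_V.append(list(V[i]))
    return U, new_V


X = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
V = [[0.0, 0.0], [10.0, 10.0]]
U, V_new = hcm_step(V, X)
print(hcm_objective(U, V, X), hcm_objective(U, V_new, X))
```

Neither half-step can increase the objective, which is why repeating the pass converges, although only to a local minimum that depends on the initial centers.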
The clustering criterion used to define good clusters for fuzzy
c-means partitions is the FCM function:

$$J_m(U,V) = \sum_{i=1}^{c} \sum_{k=1}^{n} (U_{ik})^m \, D_{ik}^2(v_i, x_k) \tag{2}$$

where
$U \in M_{fcn}$ is a fuzzy partition matrix;
$m \in [1, \infty)$ is
the weighting exponent on each fuzzy membership;
$V = [v_1, \ldots, v_c]$ is a matrix of prototype
parameters (cluster centers)
$v_i \in \Re^s$ for all i; and
$D_{ik}(v_i, x_k)$ is a measure of the distance from
$x_k$ to
the ith cluster prototype. The Euclidean distance metric
is used
for all FCM results reported here. The value m=2 is used for all FCM
results given in this paper. The larger m is, the fuzzier the partition. Good
cluster structure in X is taken as a (U,V)
minimizer of (2). Typically, optimal (U,V) pairs are sought using
an alternating optimization scheme of the type generally described in
[2].
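The fuzzy criterion can be sketched the same way. The code below uses the standard fuzzy c-means update formulas with squared Euclidean distance and m > 1: memberships are inversely related to relative distance, and each center is a mean of the objects weighted by membership raised to the power m. The function names, and the rule that an object coinciding with a center gets full membership in it, are assumptions made for this illustration rather than details taken from the paper.

```python
def sq_dist(v, x):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, x))


def fcm_objective(U, V, X, m=2.0):
    """Sum over i, k of U[i][k]**m times the squared distance."""
    return sum((U[i][k] ** m) * sq_dist(V[i], X[k])
               for i in range(len(V)) for k in range(len(X)))


def fcm_memberships(V, X, m=2.0):
    """Membership half-step for fixed centers V (requires m > 1)."""
    c, n = len(V), len(X)
    U = [[0.0] * n for _ in range(c)]
    for k, x in enumerate(X):
        d2 = [sq_dist(v, x) for v in V]
        zeros = [i for i in range(c) if d2[i] == 0.0]
        if zeros:
            # Assumption: an object sitting on a center belongs fully to it
            # (shared equally if several centers coincide with it).
            for i in zeros:
                U[i][k] = 1.0 / len(zeros)
            continue
        for i in range(c):
            # (D_ik / D_jk)^(2/(m-1)) equals (d2_i / d2_j)^(1/(m-1)).
            U[i][k] = 1.0 / sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0))
                                for j in range(c))
    return U


def fcm_centers(U, X, m=2.0):
    """Center half-step for fixed U: means weighted by U[i][k]**m."""
    s = len(X[0])
    V = []
    for row in U:
        w = [u ** m for u in row]
        total = sum(w)  # positive for any non-degenerate partition
        V.append([sum(w[k] * X[k][d] for k in range(len(X))) / total
                  for d in range(s)])
    return V


X = [[0.0], [1.0], [9.0], [10.0]]
V = [[0.0], [10.0]]
for _ in range(10):
    U = fcm_memberships(V, X)   # m=2, the value used in the paper
    V = fcm_centers(U, X)
print(V, fcm_objective(U, V, X))
```

As with the hard case, each half-step does not increase the objective for the other argument held fixed, so the iteration settles on a local minimizer that depends on the starting centers.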
Larry Hall
5/26/1998