In any generation, element i of the population is Vi, a
matrix
of cluster centers in FCM/HCM notation. The initial population of size
P is constructed by random assignment of real numbers to each of the s
features of the c cluster centers.
The initial values are constrained to be in the range (determined
from the data set) of the feature to which they are assigned, but are
otherwise random.
Since only the V's will be used within the GA it is necessary to reformulate the objective functions (3) and (4) for optimization. For HCM each data vector is assigned to the nearest cluster via some distance metric (Euclidean here). Given the way assignments to clusters are made, it is easy to see that
![]() |
(3) |
In order to work only with V's in FCM, equation (4) can be
rewritten by
substitution for U using the first order necessary condition for
U [2]. Specifically, for m > 1 as long as
, we can substitute
into (2), resulting in the reformulated FCM functional
![]() |
(4) |
Hathaway and Bezdek [6] have shown that local (V) minimizers of Rm and (U) at (3) produce local minimizers of Jm, and conversely, the V part of local minimizers of Jm yields local minimizers of Rm. Our approach will be to optimize Rm with a GGA. The reformulation theorem given in [6] provides theoretical justification for this technique.
The genetic guided algorithm is shown in Figure 1. It consists of selecting parents for reproduction, doing crossover with the parents and applying mutation to the bits of the children.
In the process of experimenting with hard c-means we
occasionally observed GGA getting caught in an
extrema associated with a degenerate partition (i.e a partition
with one or more empty rows meaning that less than c clusters were
obtained in the final partition). A row i is considered empty if
. To minimize the chance of the GA
becoming stuck at
a degenerate partition, we use the following heuristic.
The following subsections describe our approach to selection, crossover, and mutation. In each case our choices are motivated by the desire to apply genetically guided clustering to image data sets, which are usually large. So, small populations and minimal generations are preferred.