A genetically guided approach to clustering was described. Results from genetic guided minimization of the FCM and HCM functionals were given. The GGA algorithm was applied to three small data sets. All experiments were carried out with at least 50 different initial populations to get a statistically meaningful average and standard deviation for the GGA clustering approach.
The GGA approach with the FCM functional finds the best local extrema for the Iris, and MS. It usually finds the best local extrema for the single feature data set.
For the Iris data set experiments, there are more extrema found with random initializations and HCM. The GGA finds one of the best 2 extrema for all experiments on the Iris data set. In the majority of cases the best extrema is found despite the fact that different extrema and their associated partitions can be very close (e.g. the best two Iris partitions differ by exactly 1 feature vector's class assignment).
The exact choice of parameters may be determined on representative subsets of the data to be clustered. Alternatively, we have shown that an adaptive method [11] of setting crossover and mutation rates provides partitions that are very good and is almost as consistent in finding these partitions over multiple trials as the best choice of crossover and mutation operators allows.
Overall, initialization has a significant effect on the final partition. The GGA approach to clustering provides a viable way to avoid local extrema. However, it can take up to 2 orders of magnitude more time than FCM/HCM; in our experiments the GGA does take 2 orders of magnitude more time for all domains. Hence, in the same time as the GGA one could on our experimental data sets, for example, try out 100 random initializations of FCM/HCM and use the partition associated with the lowest J2/J1 value. This approach will provide equivalent results to the GGA approach in the same amount of time for all the domains studied here. In fact, one could halve the time to find the best known J2 or partition value by just using 50 random initializations for each domain.
The GGA approach seems to only have viability as a stand-alone cluster optimization procedure if its time cost can be reduced. However, even for the less time consuming real-valued GGA implementation the major cost is the evaluation of J2 per individual. Orders of magnitude speed decreases for the GGA approach do not seem likely.
A useful application of a modified GGA approach is as a framework for optimizing other functionals for partition generation, which can be compactly expressed, such as with a set of cluster centers, for which another optimization approach may not yet have been devised. The GGA produces good non-degenerate partitions which can be used to evaluate newly designed functionals.
Acknowledgements: This research was partially supported by the Whitaker Foundation and the National Cancer Institute (CA59 425-01). Amine Bensaid and Srinivas Boggavarapu contributed some code to the GA's used in the reported experiments.