Questions on the cluster centroids


Because my dataset is huge, I clustered only part of it, and I need to assign the remaining data to the clusters by calculating the distance from each item to each cluster centroid. Does CLUTO provide a way to do that? As far as I can tell, CLUTO cannot output the cluster centroids, which means I would have to calculate the centroids from the clustered data and also compute the distances between feature vectors myself. Any comments or suggestions?


RE: Cluster Centroids

I have a need similar to the one Jun described (7/24/2006): I am looking for a way to print out the final cluster centroids and/or mean vectors. I assume they could be calculated from the cluster assignments, but I would hate to do that given that CLUTO has already calculated them and just isn't outputting them (or I haven't been able to figure out how to make it do so). Any advice?


RE: How to get cluster centroids

Even though CLUTO does not provide a routine that explicitly gives you the cluster centroids, you can still get them from it by using the show-features routine and setting the number of features to the number of dimensions in your dataset. The weights associated with the descriptive features are very close to the weights of those features in the centroids.
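
If you would rather compute the centroids yourself from the cluster assignments and then place the unclustered items, here is a minimal sketch in plain Python. It is not part of CLUTO: the function names (compute_centroids, cosine_similarity, assign_to_nearest) are made up for illustration, it assumes dense feature vectors already loaded as lists of floats with one cluster label per clustered item, and it assumes cosine similarity is the measure you clustered with. Swap in your own measure if that is not the case.

```python
import math
from collections import defaultdict


def compute_centroids(vectors, labels):
    """Return {cluster_label: mean vector}; labels[i] is the cluster of vectors[i]."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for j, value in enumerate(vec):
            sums[label][j] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_to_nearest(vector, centroids):
    """Return the label of the centroid most similar to vector."""
    return max(
        centroids,
        key=lambda label: cosine_similarity(vector, centroids[label]),
    )


# Toy data (hypothetical): four clustered items and their cluster labels.
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]

centroids = compute_centroids(clustered, labels)
new_item = [0.8, 0.2]
print(assign_to_nearest(new_item, centroids))
```

For a large remaining set you would call assign_to_nearest once per item; the centroids only need to be computed once from the clustered sample.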