# What is the formula for the calculation of the external similarity?

I wonder what the exact formula for calculating the average external similarity is. I found in the CLUTO manual that the external similarity stands for "the average similarity of the objects of each cluster and the rest of the objects", but I have trouble understanding the exact mathematical formula behind it. If you could explain it for the case of the repeated bisecting algorithm with cosine as the similarity function and I2 as the clustering criterion function, I would really appreciate it! Or maybe you could suggest where I could read about it?

Thank you in advance.

Submitted by tanita on Sun, 2007-03-18 20:58

## RE: The exact formula depends on

The exact formula depends on the similarity function that you use. If you use the default, which is cosine, then the average external similarity of a cluster is the sum of the cosine similarities between each object in the cluster and each object outside it, divided by k(n-k), the number of such pairs, where k is the number of objects in the cluster and n is the total number of objects that you are clustering.
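
To make the description above concrete, here is a minimal Python sketch of that calculation. It is not CLUTO's own code: the function names `cosine` and `external_similarity` and the toy data are made up for illustration, and it assumes every object is a non-zero dense vector. Written out, the quantity it computes for a cluster C with k objects is ESim(C) = (sum over i in C, j not in C of cos(d_i, d_j)) / (k(n-k)).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def external_similarity(vectors, labels, cluster):
    """Average cosine similarity between the objects in `cluster`
    and all objects outside it (hypothetical helper, not a CLUTO API)."""
    inside = [v for v, lab in zip(vectors, labels) if lab == cluster]
    outside = [v for v, lab in zip(vectors, labels) if lab != cluster]
    k, n = len(inside), len(vectors)
    if k == 0 or k == n:
        raise ValueError("cluster must be non-empty and must not hold every object")
    # Sum over all (inside, outside) pairs, then divide by the pair count k(n-k).
    total = sum(cosine(u, v) for u in inside for v in outside)
    return total / (k * (n - k))


# Made-up toy data: four 2-D objects split into two clusters.
vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 2.0]]
labels = [0, 0, 1, 1]
esim = external_similarity(vectors, labels, 0)
print(esim)
```

For this toy data, k = 2 and n = 4, so the sum runs over 4 pairs. Two of them are orthogonal (similarity 0) and two have similarity 1/sqrt(2), which gives an average of sqrt(2)/4.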

george