Similarity functions corr vs. cos
Overall, CLUTO seems to be very useful! I noticed a huge runtime difference between the -sim=cos and -sim=corr options when using vcluster. The correlation and the cosine of two vectors are computed in much the same way, so why is there such a big difference? Are you using any implementation tricks that optimize the computation of the cosine? (There is a note about this in the manual, but it does not answer my question directly.)
Sincerely,
Submitted by Anonymous on Tue, 2006-12-19 16:09
RE: The correlation
The correlation coefficient-based implementation does not take advantage of sparse vectors.
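To illustrate why sparsity matters here, the following is a minimal Python sketch, not CLUTO's actual code. It assumes sparse vectors stored as {index: value} dictionaries; the function names sparse_cosine and dense_correlation are hypothetical. The cosine only needs the nonzero entries of each vector, whereas the correlation is the cosine of the mean-centered vectors, and subtracting the mean generally turns every zero entry into a nonzero one. A straightforward correlation therefore touches all dimensions, which is one plausible source of the runtime gap.

```python
import math


def sparse_cosine(u, v):
    """Cosine of two sparse vectors given as {index: value} dicts.

    Work is proportional to the number of nonzero entries only.
    """
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(val * v[i] for i, val in u.items() if i in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def dense_correlation(u, v, dim):
    """Pearson correlation computed naively: densify, center, then cosine.

    Centering makes the vectors dense, so work is proportional to dim.
    """
    du = [u.get(i, 0.0) for i in range(dim)]
    dv = [v.get(i, 0.0) for i in range(dim)]
    mean_u = sum(du) / dim
    mean_v = sum(dv) / dim
    cu = [x - mean_u for x in du]
    cv = [x - mean_v for x in dv]
    dot = sum(a * b for a, b in zip(cu, cv))
    norm_u = math.sqrt(sum(a * a for a in cu))
    norm_v = math.sqrt(sum(b * b for b in cv))
    return dot / (norm_u * norm_v)


# Hypothetical example data: 10 dimensions, only a few nonzero entries.
u = {0: 1.0, 3: 2.0}
v = {0: 2.0, 3: 4.0, 7: 1.0}
cos_uv = sparse_cosine(u, v)
corr_uv = dense_correlation(u, v, 10)
```

On this example the two values are close but not equal, which also shows that the two measures coincide only when both vectors already have zero mean.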