Similarity Measures for Categorical Data
I'm starting to use CLUTO, and I have some questions about the similarity measures it uses to compare objects. The CLUTO manual indicates that the .mat file must contain only floating-point numbers as the column values.
I'm researching computer security incidents, and my data contains categorical values in every column. My question is: if I replace the categorical values with integer codes, only so that vcluster can read the .mat file, which similarity measure should I use so that the objects are still compared as categorical data?
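To make the conversion concrete, here is a minimal Python sketch of what I mean. The incident records and the helper names (`encode_columns`, `to_dense_mat`) are made up for illustration, and the dense layout (a first line with the number of rows and columns, then one object per line) is my reading of the manual, so please correct me if it is wrong.

```python
# Hypothetical incident records; the values are invented for illustration.
records = [
    ("phishing", "email", "high"),
    ("malware", "web", "low"),
    ("phishing", "web", "high"),
]


def encode_columns(rows):
    """Replace each categorical value with an integer code, per column.

    Codes start at 1 and are assigned in order of first appearance.
    """
    n_cols = len(rows[0])
    codebooks = [{} for _ in range(n_cols)]  # one value->code map per column
    encoded = []
    for row in rows:
        codes = []
        for j, value in enumerate(row):
            book = codebooks[j]
            if value not in book:
                book[value] = len(book) + 1
            codes.append(book[value])
        encoded.append(codes)
    return encoded, codebooks


def to_dense_mat(encoded):
    """Format the codes as text in the dense layout I am assuming:
    'rows cols' on the first line, then one row of floats per object."""
    lines = [f"{len(encoded)} {len(encoded[0])}"]
    for codes in encoded:
        lines.append(" ".join(f"{c:.1f}" for c in codes))
    return "\n".join(lines) + "\n"


encoded, codebooks = encode_columns(records)
mat_text = to_dense_mat(encoded)
print(mat_text)
```

The integer codes here are only labels, so the numeric distance between two codes has no meaning of its own. That is why I am unsure which similarity measure is appropriate.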
I would really appreciate your answer, because CLUTO is the best software for my research.
Thank you, and I hope to hear from you soon.
Universidad Distrital Francisco Jose de Caldas - COLOMBIA