Current version: 2.1.2, 10/18/06
CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the resulting clusters. CLUTO is well suited to clustering datasets from many application areas, including information retrieval, customer purchasing transactions, the web, GIS, science, and biology.
CLUTO's distribution consists of both stand-alone programs and a library through which application programs can directly access the clustering and analysis algorithms implemented in CLUTO.
- Multiple classes of clustering algorithms:
  - partitional, agglomerative, and graph-partitioning based.
- Multiple similarity/distance functions:
  - Euclidean distance, cosine, correlation coefficient, extended Jaccard, user-defined.
- Numerous novel clustering criterion functions and agglomerative merging schemes.
- Traditional agglomerative merging schemes:
  - single-link, complete-link, UPGMA.
- Extensive cluster visualization capabilities and output options:
  - PostScript, SVG, GIF, xfig, etc.
- Multiple methods for effectively summarizing the clusters:
  - most descriptive and discriminating dimensions, cliques, and frequent itemsets.
- Can scale to very large datasets containing hundreds of thousands of objects and tens of thousands of dimensions.
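
To make the similarity/distance functions named in the feature list concrete, here is a minimal Python sketch of their standard textbook definitions for dense vectors. It is not CLUTO's code and does not use CLUTO's library interface; the function names (`cosine`, `extended_jaccard`, and so on) are hypothetical and chosen only for illustration.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def euclidean_distance(x, y):
    """Straight-line distance between x and y (smaller means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine(x, y):
    """Cosine of the angle between x and y; assumes neither vector is all zeros."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))


def correlation(x, y):
    """Pearson correlation coefficient: the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine(centered_x, centered_y)


def extended_jaccard(x, y):
    """Extended Jaccard (Tanimoto) coefficient: x.y / (|x|^2 + |y|^2 - x.y)."""
    xy = dot(x, y)
    return xy / (dot(x, x) + dot(y, y) - xy)


if __name__ == "__main__":
    # Two toy objects described by three dimensions (illustrative data only).
    a = [1.0, 0.0, 1.0]
    b = [1.0, 1.0, 0.0]
    print("euclidean distance:", euclidean_distance(a, b))
    print("cosine:", cosine(a, b))
    print("correlation:", correlation(a, b))
    print("extended Jaccard:", extended_jaccard(a, b))
```

For binary vectors the extended Jaccard coefficient reduces to the ordinary Jaccard coefficient (shared features divided by features present in either object), which is why it is a common choice for transaction-style data.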