Dictionary Learning and Sparse Coding for Unsupervised Clustering
MINNESOTA UNIV MINNEAPOLIS INST FOR MATHEMATICS AND ITS APPLICATIONS
This work introduces a clustering framework within the sparse modeling and dictionary learning setting. Instead of searching for the set of centroids that best fits the data, as in k-means-type approaches that model the data as distributions around discrete points, we optimize for a set of dictionaries, one per cluster, under which the signals are best reconstructed in a sparse coding manner. We thereby model the data as a union of learned low-dimensional subspaces, and data points associated with subspaces spanned by just a few atoms of the same learned dictionary are clustered together. Using learned dictionaries makes this method robust and well suited to handling large datasets. The proposed clustering algorithm uses a novel measure of the quality of the sparse representation, inspired by the robustness of the ℓ1 regularization term in sparse coding. We first illustrate this measure with examples on standard image and speech datasets in the supervised classification setting, showing its discriminative power with a simple approach and obtaining results comparable to the state of the art. We then conclude with experiments on fully unsupervised clustering on extended standard datasets and texture images, obtaining excellent performance.
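The assignment step described above can be illustrated with a minimal sketch. It assumes that the quality measure is the ℓ1-regularized sparse coding cost 0.5*||x - D a||^2 + lam*||a||_1, evaluated at the sparse code a obtained for each dictionary D; the abstract does not give the exact form, so this cost, the ISTA solver, the parameter lam, and all function names are illustrative assumptions rather than the authors' implementation. The dictionary update step that would alternate with this assignment is omitted.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, D, lam=0.1, n_iter=200):
    """Approximately solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA."""
    # Lipschitz constant of the gradient of the quadratic term.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a

def l1_cost(x, D, lam=0.1, n_iter=200):
    """Assumed quality measure: the sparse coding objective at its solution."""
    a = sparse_code_ista(x, D, lam, n_iter)
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def assign_clusters(X, dictionaries, lam=0.1):
    """Assign each row of X to the dictionary with the lowest cost."""
    costs = np.array([[l1_cost(x, D, lam) for D in dictionaries] for x in X])
    return costs.argmin(axis=1)

# Toy example: two dictionaries spanning orthogonal 2-D subspaces of R^4.
D0 = np.eye(4)[:, :2]
D1 = np.eye(4)[:, 2:]
X = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
labels = assign_clusters(X, [D0, D1])
print(labels.tolist())  # [0, 1]
```

In this toy setting each signal lies in the span of exactly one dictionary, so the cost under the matching dictionary is much lower than under the other. In a full clustering loop, each dictionary would then be re-learned from the signals currently assigned to it.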
- Information Science