From what I understand, one spectral clustering method performs a dimensionality reduction step using the eigenvectors corresponding to the k largest eigenvalues, followed by a standard clustering step such as k-means on the embedded points.

My question is: does it make sense to use the same process with a different dimensionality reduction technique, such as t-SNE? http://homepage.tudelft.nl/19j49/t-SNE.html
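For concreteness, here is a minimal sketch of the pipeline being asked about, assuming scikit-learn's TSNE and KMeans (the function name `tsne_then_kmeans` is illustrative, not from any library):

```python
# A minimal sketch of the proposed pipeline: embed with t-SNE, then cluster.
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def tsne_then_kmeans(X, k):
    # Reduce the data to 2-D with t-SNE ...
    Y = TSNE(n_components=2).fit_transform(X)
    # ... and run k-means on the low-dimensional embedding.
    return KMeans(n_clusters=k, n_init=10).fit_predict(Y)
```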

asked Jul 10 '11 at 14:12

maxime caron


2 Answers:

Adding to Laurens's answer: one "problem" in machine learning is that pretty much anything sensible-sounding you can try will lead you to a result that, if you squint hard enough, makes some sense.

Indeed, the eigenvectors of the graph Laplacian will, in the limit of perfectly nice data, split the connected components apart (see the paper "On Spectral Clustering: Analysis and an Algorithm" by Ng, Jordan, and Weiss for a proof of why this is so, or the longer survey "A Tutorial on Spectral Clustering" by Ulrike von Luxburg for other interpretations). K-means is usually there only as a way to solve the soft clustering problem when your graph is not so nice.
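As a concrete illustration, here is a minimal sketch of such a pipeline in the style of the Ng-Jordan-Weiss algorithm, assuming NumPy, SciPy, and scikit-learn (the function and parameter names are illustrative, not the paper's reference code):

```python
# A sketch of the Ng-Jordan-Weiss pipeline: Gaussian affinities, the
# symmetrically normalized affinity matrix, the eigenvectors for the k
# largest eigenvalues, and k-means on the row-normalized embedding.
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

def spectral_clustering(X, k, sigma=1.0):
    # Gaussian affinity matrix with zero diagonal.
    A = np.exp(-cdist(X, X, "sqeuclidean") / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Symmetric normalization D^{-1/2} A D^{-1/2}.
    d = A.sum(axis=1)
    M = A / np.sqrt(np.outer(d, d))
    # Eigenvectors for the k largest eigenvalues (eigh returns ascending order).
    _, vecs = eigh(M)
    V = vecs[:, -k:]
    # Normalize the rows to unit length, then cluster them with k-means.
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)
```

With perfectly separated data, the rows of V fall into k tight groups, which is why the final k-means step is nearly trivial in the nice-data limit described above.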

Note that what really matters here is the clustering, not the dimensionality reduction (the vectors you get from the Laplacian are supposedly nearly clustered already), so I see no reason why you would want to use t-SNE for this, as t-SNE does not give you a clustering of the data.

answered Jul 10 '11 at 16:41

Alexandre Passos ♦

Why do you say that the vectors you get from the Laplacian are nearly clustered already, but that this is not the case for other dimensionality reduction techniques such as t-SNE?

(Jul 10 '11 at 18:51) maxime caron

Have a look at the cost function that Laplacian Eigenmaps minimizes: $\sum_{i,j} w_{ij} \|y_i - y_j\|^2$, subject to a norm constraint on the columns of Y (i.e., the L2-norm of each column of Y is 1), where the weights $w_{ij}$ measure the similarity between two points. This cost function very much likes to collapse points; in fact, the only reason that not all points collapse is the norm constraint on Y. Hence, Laplacian Eigenmaps will collapse similar points (those with high $w_{ij}$) and satisfy the covariance constraint using dissimilar points (i.e., it will place those far apart).
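To make the role of the constraint concrete, here is a minimal sketch of Laplacian Eigenmaps, assuming NumPy and SciPy (the name `laplacian_eigenmaps` is illustrative); the constraint enters through a generalized eigenproblem, which rules out the trivial solution where all points collapse to one location:

```python
# A sketch of Laplacian Eigenmaps: minimizing sum_ij w_ij ||y_i - y_j||^2
# under the constraint amounts to the generalized eigenproblem
# L y = lambda D y; without the constraint, the collapsed solution
# y_i = const would trivially be optimal.
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(W, dim=2):
    """W: symmetric affinity matrix, w_ij = similarity of points i and j."""
    D = np.diag(W.sum(axis=1))   # degree matrix
    L = D - W                    # unnormalized graph Laplacian
    # Generalized eigenproblem; eigenvalues come back in ascending order.
    _, vecs = eigh(L, D)
    # Skip the constant eigenvector (eigenvalue 0) and keep the next `dim`.
    return vecs[:, 1:dim + 1]
```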

SNE is in fact very similar to Laplacian Eigenmaps, but it has a much more elaborate covariance constraint on Y. This makes it better suited for dimensionality reduction/visualization, but probably less suited for clustering.

(Jul 11 '11 at 04:44) Laurens van der Maaten

You can always do this, of course (and it probably wouldn't work too badly if you do it right), but the question is what you are then really doing, i.e., whether you can give an interpretation of the resulting algorithm.

Spectral clustering in its most basic form does not perform clustering on the graph Laplacian eigenvectors; it simply looks at the signs of the elements in the eigenvectors to determine the cluster assignments. This can be interpreted as solving a relaxation of the normalized-cut problem on a local neighborhood graph. Running k-means or the like on t-SNE or other embeddings has no such interpretation that I know of (yet), although I should note there has been a study that uses LLE as the basis for spectral clustering.
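For illustration, a minimal sketch of this sign-based cut, assuming NumPy and SciPy (the `sign_cut` name and the specific choice of the unnormalized Laplacian in a generalized eigenproblem are illustrative, not a reference implementation):

```python
# A sketch of the sign-based two-way cut: take the second-smallest
# generalized eigenvector of the graph Laplacian (the relaxed
# normalized-cut solution) and assign clusters by the sign of its entries.
import numpy as np
from scipy.linalg import eigh

def sign_cut(W):
    """W: symmetric affinity matrix of a local neighborhood graph."""
    D = np.diag(W.sum(axis=1))
    L = D - W                   # unnormalized graph Laplacian
    # Relaxed normalized cut: solve L y = lambda D y.
    _, vecs = eigh(L, D)
    fiedler = vecs[:, 1]        # second-smallest eigenvector
    return (fiedler > 0).astype(int)
```

Applying such a two-way cut recursively is what leads to the power-of-2 restriction on the number of clusters mentioned below.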

More popular variants of spectral clustering indeed perform k-means on the graph Laplacian eigenvectors, which has some advantages: your number of clusters does not need to be a power of 2 and you typically get somewhat better results. (Afaik, this sort of destroys the interpretation in terms of the normalized-cut problem.)

answered Jul 10 '11 at 15:39

Laurens van der Maaten
