|
Hello, I am a newbie to ML and have a basic question. I know that PCA is used to best represent the data, whereas LDA is used to best discriminate between classes. I have read that we can classify data between classes using PCA. How is PCA useful for classification? I mean, how can we set decision boundaries (functions) with PCA? Thanks for your answer.
|
As others have noted, PCA is primarily for dimensionality reduction/preprocessing/feature extraction. Probabilistic PCA, however, is basically a generative model: it models the density of the data. Thus you can use it in a generative classification approach: for each class c, fit p(x|c) with pPCA; for a new point x' whose class you want to know, return the c for which p(x'|c) is highest. This will in the end lead to a linear classifier and will thus not necessarily be better than logistic regression or a linear SVM. It is just a different way to estimate the parameters. The only reason to use it is if you can easily express prior knowledge with it.
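For concreteness, here is a minimal sketch of that generative recipe (my own illustration, not from the answer above), using scikit-learn's PCA, whose score_samples method returns the per-sample log-likelihood under the probabilistic PCA model; the dataset, number of components, and the explicit class prior are arbitrary choices for illustration.

```python
# Minimal sketch: one probabilistic PCA density model per class, then
# classify a new point by the highest (prior-weighted) log-likelihood.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# Fit p(x|c) with pPCA for each class c (n_components=2 is an arbitrary choice).
models = {c: PCA(n_components=2).fit(X[y == c]) for c in classes}
log_priors = {c: np.log(np.mean(y == c)) for c in classes}

def predict(X_new):
    # log p(x'|c) + log p(c) per class; return the argmax class.
    scores = np.column_stack(
        [models[c].score_samples(X_new) + log_priors[c] for c in classes]
    )
    return classes[np.argmax(scores, axis=1)]

print(predict(X[:5]))  # should mostly agree with y[:5]
```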
|
PCA can indeed be used for classification via the SIMCA approach, though it is geared toward predicting the class of new observations rather than learning an explicit decision boundary. Here is a description of how this is done: http://www.camo.com/resources/simca.html
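A rough sketch of the SIMCA idea as I understand it from that description (one PCA model per class, new observations assigned by how well each class model reconstructs them; real SIMCA additionally uses statistical distance limits, which are omitted here):

```python
# SIMCA-style sketch: fit one PCA per class, classify a new observation by
# the class whose PCA subspace reconstructs it with the smallest residual.
import numpy as np
from sklearn.decomposition import PCA

def fit_class_models(X, y, n_components=2):
    return {c: PCA(n_components=n_components).fit(X[y == c]) for c in np.unique(y)}

def predict(models, X_new):
    residuals = []
    for pca in models.values():
        X_rec = pca.inverse_transform(pca.transform(X_new))  # project and back
        residuals.append(np.sum((X_new - X_rec) ** 2, axis=1))
    classes = np.array(list(models.keys()))
    return classes[np.argmin(np.column_stack(residuals), axis=1)]
```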
|
No. PCA is an unsupervised algorithm. It does not learn decision boundaries. It can, however, be useful for reducing the dimensionality of a dataset before applying a supervised learner.
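To illustrate that preprocessing use (a minimal sketch; the dataset, number of components, and classifier are arbitrary choices, not anything specific to this thread):

```python
# PCA only reduces dimensionality here; the SVM learns the decision boundary.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
clf = make_pipeline(PCA(n_components=30), SVC(kernel="rbf"))
clf.fit(X[:1000], y[:1000])
print("test accuracy:", clf.score(X[1000:], y[1000:]))
```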
|
Fitting a PCA model is an unsupervised process. However, it can be used as a preprocessing step for a (usually non-linear) classifier so as to reduce the dimensionality of the data. See for instance this simplistic example pipeline for face recognition with eigenfaces and RBF support vector machines.

Edit: Actually I have found another way to do "classification with PCA" in this talk by Stéphane Mallat: each class is approximated by an affine manifold, with the first principal component as its direction and the class centroid as its offset, and new samples are classified by measuring the distance to the nearest manifold via an orthogonal projection. Talk: https://www.youtube.com/watch?v=lFJ7KdSdy0k (very interesting for CV people). Related papers: http://www.cmap.polytechnique.fr/scattering/ This obviously makes strong assumptions about the geometry of the classes, but it is a very interesting alternative to k-NN if you have a good feature representation.
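Here is a hedged sketch of my reading of that affine-manifold idea (not Mallat's exact method): each class is a line through its centroid along its first principal direction, and a new sample is assigned to the class whose line is nearest after orthogonal projection.

```python
# Nearest-affine-manifold sketch: one (centroid, first principal direction)
# pair per class; distance = norm of the residual orthogonal to that direction.
import numpy as np
from sklearn.decomposition import PCA

def fit_manifolds(X, y):
    models = {}
    for c in np.unique(y):
        Xc = X[y == c]
        direction = PCA(n_components=1).fit(Xc).components_[0]  # unit vector
        models[c] = (Xc.mean(axis=0), direction)
    return models

def predict(models, X_new):
    dists = []
    for centroid, direction in models.values():
        centered = X_new - centroid
        along = centered @ direction                    # component along the line
        residual = centered - np.outer(along, direction)
        dists.append(np.linalg.norm(residual, axis=1))  # orthogonal distance
    classes = np.array(list(models.keys()))
    return classes[np.argmin(np.column_stack(dists), axis=1)]
```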
Funny how we say pretty much the same thing :). I wrote my answer in parallel to yours and didn't get to see it before I had finished.
(Feb 25 '12 at 10:05) Gael Varoquaux
When I saw the question in my twitter feed I jumped on my horse to fight the supervised-PCA meme once and for all. I am glad you joined in the battle :) I wonder where this myth that one can train PCA and k-means clustering in a supervised setting comes from.
(Feb 25 '12 at 12:00) ogrisel
I don't know, but it keeps coming up, alongside ICA in my field :$
(Feb 25 '12 at 12:44) Gael Varoquaux
|