Hello, I am a newbie to ML and have a basic question.

I know that PCA is used to best represent the data, whereas LDA is used to best discriminate between classes.

I have also read that we can classify data between classes using PCA.

How is PCA useful for classification? That is, how can we define decision boundaries (functions) with PCA?

Thanks for your answers.

asked Feb 25 '12 at 08:16 by James Deen
edited Feb 25 '12 at 12:05 by ogrisel

4 Answers:

As others have noted, PCA is primarily for dimensionality reduction/preprocessing/feature extraction.

Probabilistic PCA, however, is basically a generative model: it models the density of the data. Thus, you can use it in a generative classification approach: for each class c, fit p(x|c) with pPCA. Given a new sample x' whose class you want to know, return the class c for which p(x'|c) (weighted by the class prior p(c), if the classes are not equally frequent) is highest.

This will in the end lead to a linear classifier and will thus not necessarily be better than logistic regression or a linear SVM. It is just a different way to estimate the parameters. The only reason to use it is if you can easily express prior knowledge with it.
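Here is a minimal sketch of that idea in scikit-learn (my own illustration, not part of the original answer): PCA.score_samples returns the per-sample log-likelihood under the probabilistic PCA model, so fitting one model per class and taking the argmax gives the generative classifier. The dataset and n_components are arbitrary choices.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, y = load_iris(return_X_y=True)
    classes = np.unique(y)

    # Fit one pPCA density p(x|c) per class (illustrative n_components).
    models = {c: PCA(n_components=2).fit(X[y == c]) for c in classes}
    log_priors = {c: np.log(np.mean(y == c)) for c in classes}

    def predict(X_new):
        # log p(x|c) + log p(c) for each class, then pick the argmax.
        scores = np.column_stack(
            [models[c].score_samples(X_new) + log_priors[c] for c in classes])
        return classes[np.argmax(scores, axis=1)]

    print(predict(X[:5]), y[:5])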

answered Oct 12 '12 at 07:20 by Justin Bayer

PCA can indeed be used for classification via the SIMCA approach (soft independent modelling of class analogies), although it is used for predicting the class of new observations rather than for learning explicit decision boundaries. Here is a description of how this is done:

http://www.camo.com/resources/simca.html
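Roughly, the SIMCA idea is: fit one PCA model per class and assign a new sample to the class whose PCA subspace reconstructs it with the smallest residual (the full method described at the link also applies critical limits to the residuals). A hedged sketch, with a placeholder dataset and n_components:

    import numpy as np
    from sklearn.datasets import load_wine
    from sklearn.decomposition import PCA

    X, y = load_wine(return_X_y=True)
    classes = np.unique(y)

    # One PCA model per class.
    models = {c: PCA(n_components=3).fit(X[y == c]) for c in classes}

    def residual(model, X_new):
        # Distance between each sample and its orthogonal projection
        # onto the class subspace (through the class mean).
        X_rec = model.inverse_transform(model.transform(X_new))
        return np.linalg.norm(X_new - X_rec, axis=1)

    def predict(X_new):
        dists = np.column_stack([residual(models[c], X_new) for c in classes])
        return classes[np.argmin(dists, axis=1)]

    print(predict(X[:5]), y[:5])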

answered Oct 11 '12 at 06:18 by Oliver T

No. PCA is an unsupervised algorithm. It does not learn decision boundaries. It can, however, be useful for reducing a dataset before applying a supervised learner.
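To make that division of labour concrete, a small sketch (my own example; dataset and hyperparameters are arbitrary): PCA never sees the labels and only produces reduced features, while the decision boundary is learned by the supervised model fit on top of them.

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    pca = PCA(n_components=30).fit(X_train)        # unsupervised: uses X only
    clf = LogisticRegression(max_iter=1000).fit(pca.transform(X_train), y_train)
    print(clf.score(pca.transform(X_test), y_test))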

answered Feb 25 '12 at 09:48 by Gael Varoquaux

Fitting a PCA model is an unsupervised process. However, it can be used as a preprocessing step for a (usually non-linear) classifier so as to reduce the dimensionality of the data. See for instance this simplistic example pipeline for face recognition with eigenfaces and RBF support vector machines.
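For reference, a condensed sketch of such a pipeline (modelled on the scikit-learn face recognition example; the dataset call and hyperparameters here are illustrative, not the exact original script):

    from sklearn.datasets import fetch_lfw_people
    from sklearn.decomposition import PCA
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    faces = fetch_lfw_people(min_faces_per_person=70)   # downloads LFW faces
    X_train, X_test, y_train, y_test = train_test_split(
        faces.data, faces.target, random_state=0)

    # PCA extracts "eigenface" features; the RBF SVM learns the boundaries.
    model = make_pipeline(PCA(n_components=150, whiten=True),
                          SVC(kernel="rbf", C=10, gamma="scale"))
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))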

Edit: Actually I have found another way to do "classification with PCA" in this talk by Stéphane Mallat: each class is approximated by an affine manifold, with the first principal component(s) as direction and the class centroid as offset, and new samples are classified by measuring the distance to the nearest manifold via an orthogonal projection.

Talk: https://www.youtube.com/watch?v=lFJ7KdSdy0k (very interesting for CV people)

Related papers: http://www.cmap.polytechnique.fr/scattering/

This obviously makes strong assumptions about the geometry of the classes, but it is a very interesting alternative to k-NN if you have a good feature representation.
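Written out (my notation, sketching my reading of the talk): with $\mu_c$ the centroid of class $c$ and $V_c$ the matrix whose orthonormal columns are its leading principal direction(s), a new sample $x$ is assigned to

$$\hat{c}(x) = \arg\min_c \; \big\| (x - \mu_c) - V_c V_c^\top (x - \mu_c) \big\|_2 ,$$

i.e. the class whose affine subspace $\mu_c + \operatorname{span}(V_c)$ is closest to $x$ under orthogonal projection.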

answered Feb 25 '12 at 09:41 by ogrisel
edited Feb 28 '12 at 08:18

Funny how we say pretty much the same thing :). I wrote my answer in parallel to yours and didn't get to see it before I had finished.

(Feb 25 '12 at 10:05) Gael Varoquaux

When I saw the question in my Twitter feed I jumped on my horse to fight the supervised-PCA meme once and for all. I am glad you joined the battle :) I wonder where this myth that one can train PCA and k-means clustering in a supervised setting comes from.

(Feb 25 '12 at 12:00) ogrisel

I don't know, but it keeps coming up, alongside ICA, in my field :$

(Feb 25 '12 at 12:44) Gael Varoquaux