Is there a computational method to determine the compression rate in PCA given the feature dimension and count?

My current approach: keep the first n columns for the reduction, then reproject one percent of the data and compute the difference from the original data. If the validation result is greater than the specified threshold, select n columns; otherwise increase n and repeat the loop.

Is this approach correct? Is there a better solution?

asked Aug 02 '13 at 11:36


Mostafa Sataki

edited Aug 02 '13 at 11:46


One Answer:

To determine the compression amount, you have to decide how much of the variance you want to retain. A common method is to sort the eigenvalues in descending order (most implementations of PCA already do this for you). Then, if you compute the vector of cumulative sums and normalize that vector by dividing it by the total sum, the elements of that vector indicate the fraction of total variance retained by keeping that many eigenvectors. For example:

fractions = cumsum(eigenvalues) / sum(eigenvalues)

fractions[i] is the fraction of total variance retained by keeping the first i eigenvectors (if your indexing starts at 1). So if you want to retain 0.99 of the data variance, the number of eigenvectors to keep is the smallest i such that fractions[i] >= 0.99.

The compression ratio is then just N / i, where N is the original number of dimensions.
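The selection rule above could be sketched in Python with NumPy roughly as follows. The function name `components_for_variance`, the example eigenvalues, and the 0.9 target are made up for illustration; indexing here is 0-based, so the count of eigenvectors is the found index plus one.

```python
import numpy as np

def components_for_variance(eigenvalues, target=0.99):
    """Return (k, fractions), where k is the smallest number of
    eigenvectors whose cumulative variance fraction reaches target."""
    # Sort in descending order in case the PCA routine did not.
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    fractions = np.cumsum(ev) / ev.sum()
    # First 0-based index with fractions[idx] >= target, converted to a count.
    k = int(np.searchsorted(fractions, target, side="left")) + 1
    # Guard against rounding leaving the last fraction slightly below target.
    k = min(k, len(ev))
    return k, fractions

# Hypothetical eigenvalues, for illustration only.
eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]
k, fractions = components_for_variance(eigenvalues, target=0.9)
compression_ratio = len(eigenvalues) / k   # N / i from the answer
print(k, compression_ratio)
```

With these made-up values the cumulative fractions are roughly 0.5, 0.8, 0.95, 0.99, 1.0, so three eigenvectors are enough to retain 0.9 of the variance.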

answered Aug 02 '13 at 12:39


bogatron

To clarify bogatron's answer: the mean square reconstruction error is given by the sum of the eigenvalues (variances) corresponding to the eigenvectors you reject. So you don't need to calculate the reconstruction error by repeatedly projecting onto the first 1, 2, ..., i eigenvectors; you already have the mean square reconstruction error for every possible number of principal components just by looking at the eigenvalues.

(Aug 04 '13 at 19:21) SeanV
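The point in the comment above can be checked numerically. This is a minimal sketch on synthetic data, assuming the covariance is normalized by 1/n and the error is the squared distance per sample averaged over samples; the data, the seed, and the choice k = 3 are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: 500 samples, 6 features with different scales (illustrative).
X = rng.normal(size=(500, 6)) * np.array([3.0, 2.0, 1.0, 0.5, 0.2, 0.1])
Xc = X - X.mean(axis=0)

# Eigendecomposition of the covariance matrix (1/n normalization).
cov = Xc.T @ Xc / Xc.shape[0]
eigenvalues, eigenvectors = np.linalg.eigh(cov)   # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

k = 3
W = eigenvectors[:, :k]        # keep the top-k eigenvectors
X_rec = Xc @ W @ W.T           # project onto them and reconstruct

# Squared reconstruction error per sample, averaged over samples.
mse = np.mean(np.sum((Xc - X_rec) ** 2, axis=1))
# Sum of the eigenvalues of the rejected eigenvectors.
rejected = eigenvalues[k:].sum()
print(mse, rejected)

# Error for every possible k at once, without any reprojection.
errors_by_k = eigenvalues.sum() - np.cumsum(eigenvalues)
```

Under these assumptions the two printed numbers agree up to floating-point rounding, and `errors_by_k[k - 1]` gives the same error for keeping k components.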

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.