Revision history
Revision n. 1

Nov 18 '10 at 04:48

spinxl39

Since you say that both sets of features (train and test) share the same underlying latent variables, one option would be to fit a factor analysis model with the same number of factors, say K, separately on the training data and on the test data. Once you have the new representations of the training and test data in terms of the K latent factors, you can learn a model on the training data and apply it to the test data.

Revision n. 2

Nov 18 '10 at 11:51

spinxl39

Since you say that both sets of features (train and test) share the same underlying latent variables, one option would be to fit a factor analysis model with the same number of factors, say K, separately on the training data and on the test data. Once you have the new representations of the training and test data in terms of the K latent factors, you can learn a model on the training data and apply it to the test data.

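Edit: As Alexandre pointed out, the factor analysis approach I suggested above would not actually do the right thing here, because of the identifiability issue in factor analysis: the factors are determined only up to a rotation, so factors fitted separately on the two datasets need not correspond to each other.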
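To illustrate the identifiability issue mentioned in the edit, here is a minimal NumPy sketch. It is not taken from the original discussion; the dimensions, the random loadings `W`, the noise covariance `psi`, and the rotation angle are all made up for illustration. It shows that rotating the loadings leaves the model covariance unchanged, so two separate fits have no reason to return factors in the same orientation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical factor model: 6 observed features, 2 latent factors.
n_features, n_factors = 6, 2
W = rng.normal(size=(n_features, n_factors))       # factor loadings
psi = np.diag(rng.uniform(0.5, 1.5, n_features))   # diagonal noise covariance

# Any orthogonal matrix R yields another valid set of loadings, W @ R.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
W_rotated = W @ R

# The covariance implied by the model is W W^T + psi in both cases.
cov_original = W @ W.T + psi
cov_rotated = W_rotated @ W_rotated.T + psi

same_covariance = np.allclose(cov_original, cov_rotated)
same_loadings = np.allclose(W, W_rotated)
print(same_covariance, same_loadings)
```

The data cannot distinguish `W` from `W_rotated`, which is why factor k from the training fit need not match factor k from the test fit.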

Revision n. 3

Nov 18 '10 at 11:59

spinxl39

Since you say that both sets of features (train and test) share the same underlying latent variables, one option would be to fit a factor analysis model with the same number of factors, say K, separately on the training data and on the test data. Once you have the new representations of the training and test data in terms of the K latent factors, you can learn a model on the training data and apply it to the test data.

Edit: As Alexandre pointed out, the factor analysis approach I suggested above would not actually do the right thing here, because of the identifiability issue in factor analysis: the factors are determined only up to a rotation, so factors fitted separately on the two datasets need not correspond to each other. One hack you might try is to cluster the 128 test-data features into 16 clusters (16 being the number of features in the training data), then use each cluster center as a feature, which gives a new 16-feature representation of the test data.

Revision n. 4

Nov 18 '10 at 12:13

spinxl39

Since you say that both sets of features (train and test) share the same underlying latent variables, one option would be to fit a factor analysis model with the same number of factors, say K, separately on the training data and on the test data. Once you have the new representations of the training and test data in terms of the K latent factors, you can learn a model on the training data and apply it to the test data.

Edit: As Alexandre pointed out, the factor analysis approach I suggested above would not actually do the right thing here, because of the identifiability issue in factor analysis: the factors are determined only up to a rotation, so factors fitted separately on the two datasets need not correspond to each other. One hack you might try is to cluster the 128 test-data features into 16 clusters (16 being the number of features in the training data), then use each cluster center as a feature, which gives a new 16-feature representation of the test data. Another possibility is something like weakly paired maximum covariance analysis applied to the training and test data. It is a multimodal dimensionality reduction technique, somewhat like canonical correlation analysis (CCA), but it does not require matched pairs of examples across the two datasets, and the two datasets may contain different numbers of examples, unlike in CCA.
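The feature-clustering hack could be sketched roughly as follows. This is only an illustration under assumptions of mine: the helper name `cluster_features`, the plain k-means loop, the random stand-in data, and the sample count of 200 are not from the original discussion. Each column of the test matrix is treated as one point, the columns are grouped into 16 clusters, and each cluster center becomes one new feature.

```python
import numpy as np

def cluster_features(X, n_clusters, n_iter=50, seed=0):
    """Cluster the columns (features) of X with a plain k-means.

    X has shape (n_samples, n_features). Returns the cluster centers as a
    new data matrix of shape (n_samples, n_clusters), plus the cluster
    label assigned to each original feature.
    """
    rng = np.random.default_rng(seed)
    cols = np.asarray(X, dtype=float).T  # one row per original feature
    # Initialise the centers with distinct, randomly chosen features.
    start = rng.choice(len(cols), size=n_clusters, replace=False)
    centers = cols[start].copy()

    def assign(centers):
        # Squared distance from every feature to every center.
        d = ((cols[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for k in range(n_clusters):
            members = cols[labels == k]
            if len(members):  # an empty cluster keeps its previous center
                centers[k] = members.mean(axis=0)

    return centers.T, assign(centers)

# Hypothetical test set: 200 examples with 128 features each.
X_test = np.random.default_rng(1).normal(size=(200, 128))
X_test_new, labels = cluster_features(X_test, n_clusters=16)
print(X_test_new.shape)
```

Note that the order of the 16 clusters is arbitrary, so this sketch does not by itself line the new test features up with the 16 training features.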

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.