
Hey!

I am currently doing a short text classification task and I have two types of features:

1) Bag-of-words TF-IDF features

2) Topic-model LDA features

Each on its own performs pretty well (BoW better than LDA), but now the idea came up to combine both feature groups to improve the classification.

My BoW TF-IDF features are L2-normalized, and the LDA features sum to 1.

I have now simply combined the features by concatenating them. So:

sample1: tfidf1 tfidf2 ... LDA1 ... LDA500
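A minimal sketch of this kind of concatenation, assuming scikit-learn and SciPy (the LDA block below is a random stand-in for real topic proportions, just to show the shapes):

```python
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat",
        "dogs chase cats",
        "stocks fell sharply today"]

# TF-IDF features; scikit-learn L2-normalizes rows by default.
tfidf = TfidfVectorizer(norm="l2").fit_transform(docs)

# Stand-in for LDA topic proportions: each row sums to 1.
rng = np.random.default_rng(0)
lda = rng.random((len(docs), 5))
lda = lda / lda.sum(axis=1, keepdims=True)

# Simple concatenation: [tfidf_1 ... tfidf_V, lda_1 ... lda_K]
combined = hstack([tfidf, csr_matrix(lda)]).tocsr()
print(combined.shape)  # (3, V + 5), where V is the TF-IDF vocabulary size
```

Note that the two blocks live on different scales (unit L2 norm vs. rows summing to 1), which is exactly where the scaling issue discussed below comes in.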

My baseline for this approach is the best-performing classifier on plain BoW, which in my case is a linear SVC.

With the combined features I unfortunately perform a little worse than the baseline. If I switch to multinomial Naive Bayes (MNB), I can outperform the corresponding MNB baseline, but that baseline is worse than the SVC baseline.

Maybe someone has some ideas how I can do better. Perhaps this simple concatenation approach is not the best anyway.

Thanks and many regards, Philipp

asked Oct 01 '12 at 07:06

ph_singer

edited Oct 01 '12 at 10:04

Did you rescale each feature back to [0,1] or [-1,1] before testing?

(Oct 03 '12 at 16:55) Mikhail

I believe @Alex Passos has some insight into this problem.

(Oct 04 '12 at 11:11) Joseph Turian ♦♦

See here: http://metaoptimize.com/qa/questions/5745/curious-how-does-lda-work-in-seach-queries#5752

(Oct 04 '12 at 11:29) Joseph Turian ♦♦

In short, you should probably experiment with: (a) different scaling methods, particularly over the BoW features, as proposed by @Mikhail; and (b) a hyperparameter controlling the relative weight of the LDA features.

(Oct 04 '12 at 11:31) Joseph Turian ♦♦

Turian, can you repost your comment as an answer? I don't have much to add to this question that I didn't say in the other ones on LDA.

(Oct 04 '12 at 13:42) Alexandre Passos ♦

One Answer:

See the answer by Alexandre Passos here: http://metaoptimize.com/qa/questions/5745/curious-how-does-lda-work-in-seach-queries#5752

In short, you should probably experiment with:

  1. Different scaling methods, particularly over the BoW features, as proposed by @Mikhail.
  2. A hyperparameter controlling the relative weight of the LDA features.
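One way to realize point 2 is to multiply the LDA block by a scalar weight and tune it by cross-validation. A minimal sketch with synthetic stand-in data (the feature matrices and labels here are random placeholders, not real text features):

```python
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60
tfidf = csr_matrix(rng.random((n, 100)))   # stand-in for L2-normalized TF-IDF
lda = rng.random((n, 10))
lda /= lda.sum(axis=1, keepdims=True)      # rows sum to 1, like LDA proportions
y = rng.integers(0, 2, n)                  # random binary labels (placeholder)

best_alpha, best_score = None, -np.inf
for alpha in [0.0, 0.1, 0.5, 1.0, 2.0, 5.0]:
    # alpha controls the relative weight of the LDA features;
    # alpha = 0 recovers the pure-BoW baseline.
    X = hstack([tfidf, csr_matrix(alpha * lda)]).tocsr()
    score = cross_val_score(LinearSVC(), X, y, cv=3).mean()
    if score > best_score:
        best_alpha, best_score = alpha, score
print(best_alpha, best_score)
```

With real features, the selected alpha tells you how much the LDA block actually helps on top of BoW; on this random placeholder data the scores are of course meaningless.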

answered Oct 05 '12 at 02:06

Joseph Turian ♦♦

Hey! Thanks for the answer.

I don't get a lot of insight from the other post regarding my problem.

Ad 1.) So, for example, I could try L1 normalization over the BoW features instead of L2 normalization? Or apply a second scaling method over the final combined feature set?
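In code, assuming scikit-learn's Normalizer, the two options compare roughly like this (made-up numbers, just to show the difference):

```python
import numpy as np
from sklearn.preprocessing import Normalizer

X = np.array([[3.0, 4.0, 0.0],
              [1.0, 1.0, 2.0]])

# L1: each row sums to 1, the same scale the LDA proportions live on.
X_l1 = Normalizer(norm="l1").fit_transform(X)  # row 0 -> [3/7, 4/7, 0]

# L2: each row has unit Euclidean length (the usual TF-IDF default).
X_l2 = Normalizer(norm="l2").fit_transform(X)  # row 0 -> [0.6, 0.8, 0]
```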

Ad 2.) Could you elaborate on this in greater detail?

Thanks a lot, Philipp

(Oct 08 '12 at 05:41) ph_singer


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.