Question: What existing research, if any, relates to the strategy I'm working on? I'd like to avoid reinventing the wheel or repeating mistakes others have already made.

The strategy: I'm creating a topic analysis solution for customer support emails. The problem is that the topic analysis solution I currently use (LDA using MALLET) is not great at picking out common topics: the results I get are mostly noise. The strategy I'm developing is to use website content as a 'seed'. For example, product names, features, etc. that appear in the emails customers write are also common on the website, so frequently occurring terms on the website could be a good source of 'training' or 'seed' data for any topic analysis solution I create.
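
To make the idea concrete, here is a minimal sketch of the seed-extraction step only, not a full topic model. It counts terms across website pages, keeps those that recur on several pages, and checks what fraction of emails mention any seed. The function names (`seed_terms`, `seed_coverage`), the tiny stopword list, the thresholds, and the sample pages and emails are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real list would be much larger).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "for",
             "is", "our", "with", "your", "on", "from"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def seed_terms(website_pages, top_k=5, min_pages=2):
    """Return up to top_k frequent website terms that occur on at least min_pages pages."""
    term_counts = Counter()   # total occurrences across the site
    page_counts = Counter()   # number of pages each term appears on
    for page in website_pages:
        tokens = [t for t in tokenize(page) if t not in STOPWORDS and len(t) > 2]
        term_counts.update(tokens)
        page_counts.update(set(tokens))
    candidates = {t: c for t, c in term_counts.items() if page_counts[t] >= min_pages}
    # Sort by descending count, breaking ties alphabetically for a stable result.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def seed_coverage(seeds, emails):
    """Fraction of emails that mention at least one seed term."""
    seed_set = set(seeds)
    hits = sum(1 for email in emails if seed_set & set(tokenize(email)))
    return hits / len(emails) if emails else 0.0


# Hypothetical sample data, invented for this sketch.
pages = [
    "The Acme router supports wifi and billing alerts.",
    "Reset your Acme router password from the billing page.",
    "Acme billing plans and router firmware updates.",
]
emails = [
    "My router keeps dropping wifi",
    "I was charged twice on my billing statement",
    "Hello, when do you open?",
]

seeds = seed_terms(pages)
print(seeds)
print(seed_coverage(seeds, emails))
```

A low coverage figure would suggest the website vocabulary does not match how customers actually write, which seems worth checking before feeding the seeds into any topic model.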

Thanks, Dave

asked Sep 29 '11 at 13:14

Dave Trindall

One Answer:

A previous question about semi-supervised LDA had some different approaches for "seeding" topics that you might find useful.

answered Nov 24 '11 at 17:23

David Andrzejewski


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.