
I am new to this field. I am working on a recommender for a website where the main features are the tags of uploaded items. Tags are uncontrolled and assigned by users, and each item can have multiple tags. I was thinking of doing two things: 1) related items using tags, and 2) tag recommendation for uploaded items.

I was thinking of using TF-IDF (term frequency / inverse document frequency) weighting on the tags to find related items. A second method might be latent semantic analysis.
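To make the TF-IDF idea concrete, here is a minimal sketch that treats each item's tag list as a tiny document, weights the tags by TF-IDF, and ranks other items by cosine similarity. The item ids, the tags, the smoothed IDF formula, and the function names are all assumptions chosen for illustration, not something taken from the site in question.

```python
import math
from collections import Counter

# Hypothetical toy data: item id -> list of user-supplied tags.
ITEM_TAGS = {
    "a": ["weather", "soap", "forecast"],
    "b": ["weather", "rest", "forecast"],
    "c": ["genomics", "soap"],
    "d": ["genomics", "blast", "rest"],
}


def tfidf_vectors(item_tags):
    """Return {item: {tag: weight}} with L2-normalised TF-IDF weights."""
    n_items = len(item_tags)
    # Document frequency: on how many items does each tag appear?
    doc_freq = Counter(tag for tags in item_tags.values() for tag in set(tags))
    vectors = {}
    for item, tags in item_tags.items():
        counts = Counter(tags)
        # Smoothed IDF (an assumption): a tag on every item keeps a positive weight.
        weights = {
            tag: count * (math.log((1 + n_items) / (1 + doc_freq[tag])) + 1.0)
            for tag, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors[item] = {tag: w / norm for tag, w in weights.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity of two already L2-normalised sparse vectors."""
    return sum(w * v.get(tag, 0.0) for tag, w in u.items())


def related_items(item, vectors, top_n=3):
    """Rank the other items by similarity to `item`, most similar first."""
    scores = [
        (other, cosine(vectors[item], vec))
        for other, vec in vectors.items()
        if other != item
    ]
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:top_n]


vectors = tfidf_vectors(ITEM_TAGS)
print(related_items("a", vectors))
```

With uncontrolled tags, it is probably worth normalising them first (lowercasing, merging obvious spelling variants), since TF-IDF treats "Weather" and "weather" as unrelated tags.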

With regard to tag recommendation for uploaded items, which machine learning technique would work well and be easy to implement?

asked Jul 13 '10 at 11:20

damoose10

edited Jul 14 '10 at 11:00

Joseph Turian ♦♦

You need to mention what information is available about the new items to use as features for tag suggestion.

(Jul 13 '10 at 12:20) cityhall

That was my first question about the question as well: are we working with images, blog posts, links, or something else? The features available for extraction and analysis depend on the media type.

You might also want to include some meta-data in your feature set such as parent thread ID, thread subject information, etc.

(Jul 13 '10 at 15:27) mfhughes

The items have metadata features, namely web service endpoints. These may be too technical, so I was told not to use them at this time. Hence I would treat each item as a document for now.

Users mostly visit to download these files, choosing them by the item's description and tags, which are written by the uploader. Other users can add tags as well. It is a folksonomy (collaborative tagging).

So the measurable features of a document are its description, its tags, and its number of views and downloads. Ratings and reviews are not used much.

(Jul 13 '10 at 19:49) damoose10

2 Answers:

Another approach is to model the tags using Labeled LDA. According to the paper that introduced it, it works better for predicting tags than training many independent classifiers (one per tag).

answered Jul 13 '10 at 18:10

Alexandre Passos ♦

You can use tags as "extra" features (derived from the tag vocabulary, or "bag of tags") along with features extracted from whatever other information is available for the items. Take a look at the correspondence LDA paper ("Modeling Annotated Data" by Dave Blei & Mike Jordan), which is designed specifically for topic modeling of multimodal data (e.g., image and text).

Such a topic-based analysis would give you a representation of the items in a semantic subspace, which you can use for various purposes. For example, you can compare items in this semantic subspace to find similar items. For the task of assigning tags to a new item, a simple approach would be to look for the semantically most similar item(s) and assign the same set of tags to the new item.
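The nearest-neighbour tagging step can be sketched as follows. Note the substitution: instead of a topic-model representation, this sketch compares items by plain bag-of-words cosine similarity over their descriptions, which is only a stand-in for the semantic subspace. The catalogue, the descriptions, the parameters `k` and `max_tags`, and the similarity-weighted voting are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy catalogue: (description, tags) pairs already on the site.
CATALOGUE = [
    ("daily weather forecast service for european cities", ["weather", "forecast"]),
    ("hourly weather forecast and rain radar", ["weather", "radar"]),
    ("protein sequence alignment service", ["bioinformatics", "alignment"]),
    ("dna sequence search against a genome database", ["bioinformatics", "search"]),
]


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def suggest_tags(description, catalogue, k=2, max_tags=3):
    """Suggest tags by similarity-weighted voting among the k nearest items."""
    query = bag_of_words(description)
    scored = sorted(
        ((cosine(query, bag_of_words(desc)), tags) for desc, tags in catalogue),
        key=lambda pair: -pair[0],
    )
    votes = Counter()
    for similarity, tags in scored[:k]:
        if similarity <= 0.0:
            continue  # ignore neighbours that share no words with the query
        for tag in tags:
            votes[tag] += similarity
    ranked = sorted(votes.items(), key=lambda pair: (-pair[1], pair[0]))
    return [tag for tag, _ in ranked[:max_tags]]


print(suggest_tags("weather forecast for mountain hiking", CATALOGUE))
```

To follow the topic-based suggestion more closely, `bag_of_words` and `cosine` would be replaced by the topic (or LSA) vectors of the items and a similarity in that space; the voting step stays the same.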

answered Jul 13 '10 at 15:55

spinxl39

edited Jul 13 '10 at 16:00


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.