What are the current state-of-the-art methods for unsupervised WSD?

asked Dec 19 '11 at 23:57

Ehsan Khoddam Mohammadi

3 Answers:

Are you interested in systems that label contexts using a predefined set of senses, or in ones that learn the senses automatically and then label the contexts? People typically refer to the first task as Unsupervised Word Sense Disambiguation and the second as Word Sense Induction.

For Unsupervised Word Sense Disambiguation, the best models seem to use graph-based methods. Personalizing PageRank for Word Sense Disambiguation is one of the best places to start. Others have improved upon this with novel weighting of the edges in the graph or by filtering nodes, but this model does 90% of the work. Another good approach has been to enrich your knowledge base with a lot of additional relations, as done here. A word of caution, though: their results are superb, but I've attempted to reimplement their method and it's been somewhat troublesome.
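To make the Personalized PageRank idea concrete, here is a minimal sketch in plain Python. It is not the implementation from the paper: the toy graph, the sense labels such as `bank#finance`, and the function name are all made up for illustration, and a real system would run over a full knowledge base such as WordNet. The only point is that the teleport mass is concentrated on the senses of the context words, so candidate senses close to the context end up with higher rank.

```python
from collections import defaultdict

def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """PageRank by power iteration; teleport mass goes only to the seed nodes."""
    neighbors = defaultdict(set)
    for a, b in edges:  # treat the knowledge base as an undirected graph
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = list(neighbors)
    seeds = [s for s in seeds if s in neighbors]  # ignore seeds missing from the graph
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        rank = {
            n: (1 - damping) * teleport[n]
            + damping * sum(rank[m] / len(neighbors[m]) for m in neighbors[n])
            for n in nodes
        }
    return rank

# Hypothetical toy knowledge base: each edge links two related senses.
toy_edges = [
    ("bank#finance", "money#currency"),
    ("bank#finance", "loan#credit"),
    ("money#currency", "loan#credit"),
    ("bank#river", "water#liquid"),
    ("bank#river", "shore#land"),
    ("water#liquid", "money#currency"),  # arbitrary link so the graph is connected
]
# Assumed context for the target word "bank": the words "money" and "loan".
context_senses = ["money#currency", "loan#credit"]
rank = personalized_pagerank(toy_edges, context_senses)
best_sense = max(["bank#finance", "bank#river"], key=rank.get)
print(best_sense)
```

On this toy graph the financial sense ranks above the river sense, because it sits next to both seed nodes.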

For Word Sense Induction, graph-based methods seem to do really well. This model builds co-occurrence graphs from contexts, does some really nifty and simple filtering, and then clusters the graph using a somewhat complicated random walk algorithm. You can simplify it a little bit by finding nodes with high PageRank scores and still get decent performance.
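A rough sketch of the simplified variant (high-PageRank nodes as sense hubs) might look like the following. This is my own toy reading of the idea, not the model referred to above: it skips the filtering step entirely, the greedy hub selection and the overlap-based labelling are assumptions, and the six contexts of "bank" are invented.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(contexts):
    """Weighted graph: an edge's weight is how many contexts contain both words."""
    weights = defaultdict(lambda: defaultdict(int))
    for tokens in contexts:
        for a, b in combinations(sorted(set(tokens)), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights

def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank with uniform teleport, by power iteration."""
    nodes = list(weights)
    strength = {n: sum(weights[n].values()) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1 - damping) / len(nodes)
            + damping * sum(rank[m] * w / strength[m] for m, w in weights[n].items())
            for n in nodes
        }
    return rank

def pick_hubs(weights, rank, max_senses=2):
    """Greedily take the highest-ranked nodes that are not adjacent to an earlier hub."""
    hubs = []
    for node in sorted(rank, key=rank.get, reverse=True):
        if all(node not in weights[h] for h in hubs):
            hubs.append(node)
        if len(hubs) == max_senses:
            break
    return hubs

def label_context(tokens, hubs, weights):
    """Assign a context to the hub its words co-occur with most."""
    def affinity(hub):
        return sum(weights[hub].get(t, 0) + (t == hub) for t in tokens)
    return max(hubs, key=affinity)

# Invented contexts of the target word "bank", with the target itself removed.
contexts = [
    ["money", "loan", "deposit"],
    ["loan", "interest", "money"],
    ["deposit", "money", "interest"],
    ["river", "water", "shore"],
    ["water", "fishing", "river"],
    ["shore", "river", "fishing"],
]
graph = cooccurrence_graph(contexts)
hubs = pick_hubs(graph, pagerank(graph))
labels = [label_context(tokens, hubs, graph) for tokens in contexts]
print(hubs, labels)
```

Each hub stands in for one induced sense. With real data you would want the filtering step back, since frequent but uninformative words otherwise dominate the ranking.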

Alexandre makes a good point about WSI: the basic approach is to turn a large set of sample sentences into vectors and cluster them. The shared task he mentions does a reasonable job of laying out the different approaches, but the metrics can be somewhat misleading.
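For the basic vectors-then-cluster approach, a bare-bones sketch could be the following. It assumes bag-of-words vectors, cosine similarity, and a k-means-style loop with a deterministic farthest-first initialisation; those are my choices for the example, not something prescribed by the shared task, and the contexts are invented.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_contexts(contexts, k=2, iterations=10):
    """Cluster tokenized contexts; returns one cluster index per context."""
    vectors = [Counter(tokens) for tokens in contexts]
    # Farthest-first initialisation: start from the first context, then keep
    # adding the context least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        labels = [
            max(range(k), key=lambda i: cosine(v, centroids[i])) for v in vectors
        ]
        for i in range(k):
            members = [v for v, label in zip(vectors, labels) if label == i]
            if members:
                # The summed counts point in the same direction as the mean,
                # which is all cosine similarity cares about.
                centroids[i] = sum(members, Counter())
    return labels

# Invented contexts of the target word "bank", with the target itself removed.
contexts = [
    ["money", "loan", "deposit"],
    ["loan", "interest", "money"],
    ["deposit", "money", "interest"],
    ["river", "water", "shore"],
    ["water", "fishing", "river"],
    ["shore", "river", "fishing"],
]
labels = cluster_contexts(contexts, k=2)
print(labels)
```

Each cluster index plays the role of an induced sense. Real systems differ mostly in how the vectors are built (for example with dimensionality reduction) and in how the number of clusters is chosen.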

answered Dec 20 '11 at 19:26

Keith Stevens
In 2010 there was the SemEval word sense induction shared task, which seems to provide the latest datasets and evaluation metrics.

The following ACL 2011 paper, Latent Semantic Word Sense Induction and Disambiguation, includes a short literature review and presents some relevant numbers for their system and others, and those numbers should be trustworthy given how recent the task is.

As far as I can tell, the main approach for unsupervised word sense disambiguation is still to cluster (word type, context) pairs in some latent space, and it works pretty well as long as there is a lot of data. A really interesting technique is to use parallel corpora to align word tokens and see which tokens of the same type map consistently to different types in the target language. Indeed, at school one of the hints I was given as to when to use "por que" instead of "porque" in Portuguese was that the first translates to "why" while the latter translates to "because".
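The parallel-corpus trick can be sketched in a few lines, assuming the word alignment has already been produced by some aligner. The function name, the `min_count` threshold, and the English-Portuguese pairs below are made up for illustration; the idea is only that a source word with several consistent translations is a candidate for having several senses, and each translation then serves as a sense label.

```python
from collections import Counter, defaultdict

def translation_senses(aligned_tokens, min_count=2):
    """Map each ambiguous source word to its consistent translations.

    aligned_tokens: iterable of (source_word, aligned_target_word) pairs.
    A translation counts as consistent if it is seen at least min_count times;
    only source words with more than one such translation are returned.
    """
    counts = defaultdict(Counter)
    for source, target in aligned_tokens:
        counts[source][target] += 1
    return {
        source: {t: c for t, c in targets.items() if c >= min_count}
        for source, targets in counts.items()
        if sum(c >= min_count for c in targets.values()) > 1
    }

# Invented alignments from a hypothetical English-Portuguese parallel corpus.
aligned = [
    ("bank", "banco"), ("bank", "margem"), ("bank", "banco"),
    ("bank", "margem"), ("river", "rio"), ("river", "rio"),
    ("bank", "beira"),  # seen once only, so treated as alignment noise
]
senses = translation_senses(aligned)
print(senses)
```

Here "bank" comes out with two sense labels ("banco" and "margem"), while "river" is left out because it has a single translation.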

answered Dec 20 '11 at 15:06

Alexandre Passos ♦
edited Dec 20 '11 at 15:08

Have a look at section 5 of this paper. It uses a supervised learning algorithm, but it learns from pre-tagged Wikipedia data, so you'll be able to do a good job of disambiguating words given some context, though only for things that have Wikipedia pages.

answered Dec 21 '11 at 17:30

bronzebeard
User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.