Word Representations for NLP

by Joseph Turian, Lev Ratinov, and Yoshua Bengio.

Code and data for NLP word features. If you plug these word features into an existing supervised NLP system, it might improve accuracy, according to Turian et al. (2010).

If you just want word features to stick into your supervised NLP system and hopefully improve prediction, grab these Brown clusters (README) or these neural word embeddings (README).
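If you want to sanity-check the embedding files before wiring them into your system, here is a minimal loader sketch. It assumes one word per line followed by its whitespace-separated vector components; check the README for the exact file format.

```python
# Minimal sketch of loading plain-text word embeddings into a dict.
# Assumed format (check the README): one word per line, followed by its
# whitespace-separated vector components.
import numpy as np

def load_embeddings(path):
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split()
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            embeddings[parts[0]] = np.array(parts[1:], dtype=float)
    return embeddings

# e.g. vectors = load_embeddings("embeddings.txt")
```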


Excerpted t-SNE visualization of neural word embeddings. As you can see, words that are syntactically and semantically related are closer together.
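If you want a similar 2-D layout of your own embeddings, here is a sketch using scikit-learn's TSNE; this is not the code that produced the figure above.

```python
# Sketch: project word embeddings to 2-D with t-SNE and label each point.
# Uses scikit-learn's TSNE, not the code behind the original figure.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_embeddings(embeddings, words):
    # Stack the chosen words' vectors into a matrix, then embed in 2-D.
    matrix = np.stack([embeddings[w] for w in words])
    coords = TSNE(n_components=2, random_state=0).fit_transform(matrix)
    plt.scatter(coords[:, 0], coords[:, 1], s=4)
    for (x, y), word in zip(coords, words):
        plt.annotate(word, (x, y), fontsize=7)
    plt.show()
```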

Summary

This code and data are based upon Turian et al. (ACL 2010), "Word representations: A simple and general method for semi-supervised learning", which supersedes our earlier NIPS workshop paper. You can get an introduction to the methodology by watching the video of my workshop talk, but the results presented in that talk have been superseded.

Support

Support is provided under a "Pay it forward" license: I'll give you support, but in turn you should clearly document the support I gave you, as well as anything else you figured out yourself. You should then send that documentation upstream, to save the next user from the same stumbling block.

Neural language model (Collobert + Weston)

Code: Here is code for the neural language model. Use the revision from 2009-12-27; in later revisions, I adapt the neural language model code for inducing multilingual language models. I recommend you pull the git repo, but you can also download this .tar.gz code snapshot.

Data, with t-SNE visualizations: [ README | default hyperparameters.language-model.yaml ]

HLBL language model

Code: Andriy Mnih was kind enough to run his HLBL code on our data.

Data: [ README | 50 dim | 50 dim unscaled | 100 dim | 100 dim unscaled | scale-embeddings.py ]
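The "unscaled" files are the raw HLBL output. Assuming scale-embeddings.py rescales the whole embedding matrix to a small fixed target standard deviation, as described in the ACL 2010 paper (check the README to confirm), the operation is roughly:

```python
# Sketch of embedding rescaling: shrink the whole embedding matrix so that
# its overall standard deviation equals sigma. Assumed to mirror what
# scale-embeddings.py does; check the README and the ACL 2010 paper.
import numpy as np

def scale_embeddings(matrix, sigma=0.1):
    # One global standard deviation over all entries, then rescale.
    return sigma * matrix / np.std(matrix)
```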

Brown clusters

Code: Percy Liang's C++ implementation of Brown clustering, version 1.2.

Data: [ README | 100 classes | 320 classes | 1000 classes | 3200 classes ]
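Percy Liang's implementation writes a "paths" file with one tab-separated line per word: the cluster bit-string, the word, and its corpus count. A common way to turn the clusters into features for a supervised system is to use bit-string prefixes of several lengths; here is a sketch, with illustrative prefix lengths and feature names.

```python
# Sketch: load Brown cluster paths and emit prefix features for a word.
# Paths file format: <bit-string> \t <word> \t <count>.
# Prefix lengths and feature names below are illustrative.
def load_brown_paths(path):
    word_to_bits = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            bits, word, _count = line.rstrip("\n").split("\t")
            word_to_bits[word] = bits
    return word_to_bits

def brown_features(word, word_to_bits, prefix_lengths=(4, 6, 10, 20)):
    bits = word_to_bits.get(word)
    if bits is None:
        return []  # out-of-vocabulary word: no cluster features
    return ["brown%d=%s" % (n, bits[:n]) for n in prefix_lengths]
```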

CRF Chunking with word representations

Code: Here is code for CRF chunking using word representations.
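The general recipe is to add word-representation features to an existing feature set. Here is a minimal sketch of per-token features that augments a few standard chunking features with one real-valued feature per embedding dimension; the feature names are illustrative, not the package's actual templates.

```python
# Sketch: per-token features for chunking, combining standard lexical
# features with one real-valued feature per embedding dimension.
# Feature names are illustrative, not the package's actual templates.
def token_features(tokens, pos_tags, i, embeddings):
    word = tokens[i]
    feats = {
        "w=" + word.lower(): 1.0,
        "pos=" + pos_tags[i]: 1.0,
        "w-1=" + (tokens[i - 1].lower() if i > 0 else "<s>"): 1.0,
    }
    vector = embeddings.get(word.lower())
    if vector is not None:
        for d, value in enumerate(vector):
            feats["emb%d" % d] = float(value)  # real-valued feature
    return feats
```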

Perceptron NER with word representations

Code+Models: Go here to read about and download the perceptron NER with word representations, including code and models.

Random indexing word representations

Code: Here is code for inducing random indexing word representations. This code was NOT used in any of our published experiments, but is included for those who are curious.
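For the curious, the technique works roughly as follows: each vocabulary word gets a fixed sparse random "index vector" (a few +1/-1 entries in a high-dimensional space), and a word's representation is the accumulated sum of the index vectors of the words it co-occurs with. Here is a minimal sketch, with illustrative dimensionality, sparsity, and window size; it is not the released code.

```python
# Sketch of random indexing. Each word gets a fixed sparse random index
# vector; a word's representation accumulates the index vectors of its
# context words. Dimensionality, sparsity, and window are illustrative.
import numpy as np

def index_vector(dim, nonzero, rng):
    # Sparse ternary vector: `nonzero` random positions set to +1 or -1.
    v = np.zeros(dim)
    positions = rng.choice(dim, size=nonzero, replace=False)
    v[positions] = rng.choice([-1.0, 1.0], size=nonzero)
    return v

def random_indexing(corpus, dim=1000, nonzero=10, window=2, seed=0):
    rng = np.random.default_rng(seed)
    index = {}    # word -> fixed random index vector
    context = {}  # word -> accumulated context vector
    for sentence in corpus:
        for i, word in enumerate(sentence):
            context.setdefault(word, np.zeros(dim))
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                neighbor = sentence[j]
                if neighbor not in index:
                    index[neighbor] = index_vector(dim, nonzero, rng)
                context[word] += index[neighbor]
    return context

# e.g. vectors = random_indexing([["the", "cat", "sat"], ["the", "dog", "sat"]])
```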