HMMs are widely used in bioinformatics, speech recognition, and NLP. On the other hand, when I read papers about RNNs, they are usually applied to synthetic data or "tested" non-competitively, with no quantitative comparison to HMMs.

I understand the theoretical argument that RNNs can, in principle, form distributed representations and HMMs cannot, but are there practical applications where RNNs are actually known to do better than HMMs?

asked Jan 08 '13 at 02:37

Max


3 Answers:

Note that the restriction of HMMs pointed out by larsmans is only true when the component distributions of the HMM are multinomials. With log-linear component distributions, you can use features and thus potentially generalize to unseen words. The basic difference between an HMM and a CRF in this case is that the CRF can use features that take the whole input into account, even for transition features, which makes the CRF more powerful in the supervised case. The advantage of the HMM is that it can be used in an unsupervised setting as well.
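To make the feature-based emission idea concrete, here is a minimal sketch of a log-linear emission distribution p(word | state) proportional to exp(theta_state . f(word)). The feature templates, weights, and function names are my own illustrative assumptions, not from the thread; the point is only that surface features (suffixes, capitalization) let the model assign sensible probabilities to words never seen in training.

```python
import numpy as np

def word_features(word, suffixes=("ing", "ed", "ly", "s")):
    """Tiny illustrative feature vector: bias, capitalization, a few suffix indicators."""
    feats = [1.0, float(word[0].isupper())]
    feats += [float(word.endswith(s)) for s in suffixes]
    return np.array(feats)

def emission_probs(words, theta_state):
    """Log-linear emission: p(word | state) proportional to exp(theta_state . f(word)),
    normalized over the candidate word list."""
    scores = np.array([theta_state @ word_features(w) for w in words])
    scores -= scores.max()              # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()

# A word unseen in training ("blogging") still gets a reasonable probability,
# because it shares features (the "-ing" suffix, lowercase) with seen words.
vocab = ["running", "walked", "quickly", "Paris", "blogging"]
theta_verbish = np.array([0.0, -1.0, 2.0, 1.5, 0.0, 0.0])  # assumed weights for one state
print(dict(zip(vocab, emission_probs(vocab, theta_verbish).round(3))))
```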

I think that by introducing additional latent variables into an HMM you could in principle make it induce distributed representations as well, but this will make inference much more complicated.

Personally, I tend to think of graphical models, such as HMMs and CRFs, as more flexible than neural networks. In the former, the graph structure is adapted to specific data instances, while the structure of an NN is fixed (I suppose this picture is more complicated when the NN is recurrent).

answered Jan 08 '13 at 07:21

Oscar Täckström

edited Jan 08 '13 at 07:23

Interesting, I didn't know that. Nor do many people in NLP, apparently, given the ugly hacks that have been devised to make HMMs handle unseen samples. As for recurrent NNs, they can be built with pretty much arbitrary architectures and still be trained with variants of backprop.

(Jan 08 '13 at 15:12) larsmans

One thing where RNNs really seem to shine is learning longer-term dependencies. The text generator by Ilya Sutskever is able to generate text with matching opening and closing brackets with many characters in between. For an HMM this would require an incredibly large number of hidden states.

(Jan 09 '13 at 05:21) Philemon Brakel
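A back-of-the-envelope illustration of the point in the comment above (the numbers and variable names are my own assumptions): a first-order HMM that must track bracket nesting depth up to D jointly with K other context distinctions needs a state space that grows multiplicatively, whereas an RNN can in principle keep a depth-like counter in a single hidden dimension.

```python
# Rough state-count arithmetic for matched brackets, assuming:
#   D = maximum nesting depth the model must respect
#   K = number of other context distinctions tracked jointly (topic, character class, ...)
# A first-order HMM needs a distinct hidden state for every (depth, context) pair.
for D in (10, 50, 100):
    for K in (1, 20):
        print(f"depth<={D:3d}, context size {K:2d} -> HMM needs ~{D * K:5d} states")
```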

larsmans, although this is a pretty old idea, it was popularized in NLP by this paper http://www.denero.org/content/pubs/naacl10_berg_painless.pdf . Using this for unsupervised and weakly supervised learning seems to give substantial improvements over a vanilla HMM.

(Jan 09 '13 at 06:10) Oscar Täckström

Philemon, interesting, do you have a sense of how much more computationally demanding an RNN is for this task compared to an HMM with similar performance?

(Jan 09 '13 at 06:12) Oscar Täckström

Prediction with an RNN is extremely efficient: it's just a number of matrix multiplications and additions, linear in the number of time steps of your sequence. Training is more involved, however, and can take rather long. The error landscape of RNNs can have near-discontinuities and very long plateaus.

(Jan 10 '13 at 10:44) Justin Bayer
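To make the prediction-cost point above concrete, here is a minimal sketch of a vanilla RNN forward pass (the matrix shapes and names are illustrative assumptions): each time step costs a fixed number of matrix-vector products plus a nonlinearity, so prediction is linear in sequence length.

```python
import numpy as np

def rnn_forward(x_seq, W_in, W_rec, W_out, b_h, b_o):
    """Vanilla RNN forward pass: O(T) matrix-vector products for T time steps."""
    h = np.zeros(W_rec.shape[0])
    outputs = []
    for x_t in x_seq:                       # one cheap update per time step
        h = np.tanh(W_in @ x_t + W_rec @ h + b_h)
        outputs.append(W_out @ h + b_o)     # linear readout; add softmax/sigmoid as needed
    return np.array(outputs)

# Toy dimensions, purely illustrative.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 5, 16, 3, 100
params = (rng.normal(scale=0.1, size=(n_hidden, n_in)),
          rng.normal(scale=0.1, size=(n_hidden, n_hidden)),
          rng.normal(scale=0.1, size=(n_out, n_hidden)),
          np.zeros(n_hidden), np.zeros(n_out))
y = rnn_forward(rng.normal(size=(T, n_in)), *params)
print(y.shape)  # (100, 3)
```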

RNNs are the state of the art at modeling symbolic sequences of polyphonic music, performing much better than HMMs on this task.

This is because, with an exponential number of note patterns ("chords"), discrete HMMs have a hard time generalizing well even with smoothing and back-off techniques. Independent binary N-grams for each note would eliminate this problem, but would ignore simultaneous note correlations (harmony). Also, RNNs can be trained using Hessian-free (HF) optimization, and the usual prediction layer can be replaced with more powerful output probability models (e.g. RBM or NADE).
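As a rough sketch of why the output parameterization matters here (layer sizes and names are my assumptions; the actual models in this line of work use the RBM or NADE output layers mentioned above): a discrete HMM that treats each chord as one symbol needs an emission distribution over 2^88 note combinations per state, whereas a network can emit 88 per-note probabilities from its hidden state. The per-note sigmoid version below ignores within-chord correlations, which is exactly the gap the RBM/NADE output layers are meant to close.

```python
import numpy as np

N_NOTES = 88  # piano roll: one binary indicator per pitch

def chord_probs_from_hidden(h, W_out, b_out):
    """Per-note Bernoulli output: 88 independent probabilities from one hidden state."""
    return 1.0 / (1.0 + np.exp(-(W_out @ h + b_out)))

# A discrete "chord symbol" HMM would instead need an emission table over
# every possible note combination:
print(f"joint chord symbols per state: 2**{N_NOTES} = {2**N_NOTES:.3e}")
print(f"per-note outputs per state:    {N_NOTES}")

rng = np.random.default_rng(1)
n_hidden = 32
h = rng.normal(size=n_hidden)
p = chord_probs_from_hidden(h, rng.normal(scale=0.1, size=(N_NOTES, n_hidden)),
                            np.zeros(N_NOTES))
print(p.shape)  # (88,)
```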

RNNs can also perform polyphonic transcription better than competing methods. The RNN-RBM achieves temporal smoothing much better than the popular HMM approach.

answered Feb 10 '13 at 00:19

boulanni

Juergen Schmidhuber's page links to a few. It seems like [joined-up] handwriting recognition was quite successful...

http://sourceforge.net/apps/mediawiki/rnnl/index.php?title=Main_Page

http://www.idsia.ch/~juergen/rnn.html.

answered Jan 10 '13 at 04:25

SeanV

http://www.cs.toronto.edu/~graves/arabic_ocr_chapter.pdf

(Feb 10 '13 at 03:30) gdahl ♦