I have a 2-state HMM and train it offline. Then I receive real-time observations, keep a rolling window, and decode over the emissions in that window. However, I get a lot of abrupt jumps in the outcome (the posterior probability distribution over the states). What can cause this? Is the emission alphabet too large, perhaps? After the training stage, the emission probabilities of the two states seem to differ by much more than they should. Is this an initialization problem, or is my training interval too small? Where should I look for the source of the problem? I used MATLAB's hmmtrain() and hmmdecode(). Thank you.
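For reference, the setup is roughly the following (the variable names are just illustrative, not my actual code):

    % Offline training, then rolling-window decoding with MATLAB's HMM functions.
    [estTrans, estEmis] = hmmtrain(trainSeq, transGuess, emisGuess);  % offline stage

    N       = 2000;                                   % rolling window length
    window  = liveSeq(end-N+1:end);                   % most recent N emission symbols
    pStates = hmmdecode(window, estTrans, estEmis);   % 2-by-N posterior state probabilities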
If you really believe your data should not be "jumpy", I think the best way to capture this is in your prior on the transition matrix. You estimate your transition probabilities from (partial) counts in your transition count matrix. To add a prior of "inertia", add counts to the diagonal before training. Boosting the diagonal counts boosts the probability of self-transitioning, i.e. keeping the same state at time t as at time t-1. The bigger the count you add to the diagonal, the stronger your prior and the less you will "trust" the local search in your unsupervised training. If you want to get really fancy, you can try posterior regularization of the transition matrix.
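To make the idea concrete, here is a toy sketch with made-up counts; kappa is a pseudo-count strength you would have to tune, and as far as I know hmmtrain does not expose pseudo-counts, so you would apply this inside your own Baum-Welch M-step:

    % Toy example: add an "inertia" pseudo-count to the diagonal of the
    % expected transition-count matrix, then row-normalize.
    counts = [120  80;
               70 130];                            % hypothetical expected transition counts
    kappa  = 200;                                  % prior strength: bigger = stickier states
    counts = counts + kappa * eye(2);              % boost self-transition counts
    trans  = diag(1 ./ sum(counts, 2)) * counts;   % row-normalize into a transition matrix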
Are you using supervised or unsupervised training? If unsupervised, it might just be that your data is much better explained by a jumpy model than by a persistent one (for example, observations that are mixed from two clusters are explained really well by a "jumpy" model). If this is true, why do you think your model should be less jumpy? Is there some meaning you're expecting these HMM states to have that they do not have? Which properties of this meaning match, and which don't match, the assumptions of hidden Markov models? Regardless, if you think more weight should be given to the transition probabilities, feel free to exponentiate them and renormalize (a short sketch of this appears below). If you're using supervised training, then you might be better off using a CRF than an HMM, since it seems the emissions are significantly outweighing the transitions, and a CRF can weight these two factors properly.
It is unsupervised. Yes, the outcome has a certain physical meaning that I do not expect to change so rapidly. The changes are in the right direction, but they are too abrupt. When I apply an exponential moving average (EMA) to the outcome, it looks much more satisfactory, but this is just a hack, not a satisfactory solution. Any suggestions on what I can do to solve the problem more cleanly and methodically?
(Mar 07 '12 at 18:06)
Viktor Simjanoski
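For what it's worth, a minimal sketch of the re-weighting mentioned in the answer above (exponentiate the trained transition matrix and renormalize); gamma is an assumed tuning knob, and estTrans, estEmis and window stand for the trained matrices and the current rolling window:

    % Sharpen the trained transition matrix before decoding: with a dominant
    % diagonal, gamma > 1 makes self-transitions even more likely.
    gamma      = 3;                                            % assumed weighting exponent
    transSharp = estTrans .^ gamma;                            % element-wise power
    transSharp = diag(1 ./ sum(transSharp, 2)) * transSharp;   % row-normalize
    pStates    = hmmdecode(window, transSharp, estEmis);       % decode with re-weighted transitions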
I also believe that your emission probability values have a significant influence. You might try putting a prior on your emission distribution (see the sketch after this comment). Alternatively, revisit the independence assumptions you made about your observations. For example, a strong assumption of full independence between observation variables tends to produce either very high or very low emission probabilities.
(Mar 08 '12 at 05:42)
Christopher Tay
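A minimal sketch of what a simple prior on the emissions could look like; alpha is an assumed smoothing strength, and estEmis is the 2-by-M emission matrix returned by hmmtrain:

    % Smooth the trained emission matrix with a uniform pseudo-count
    % (a crude symmetric Dirichlet prior) so no single symbol dominates.
    alpha   = 0.05;                                   % smoothing strength (tune)
    M       = size(estEmis, 2);                       % emission alphabet size
    emisSm  = (estEmis + alpha) ./ (1 + alpha * M);   % rows still sum to 1
    pStates = hmmdecode(window, estTrans, emisSm);    % decode with smoothed emissions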
Will decreasing the emission alphabet size make the model less jumpy?
(Mar 08 '12 at 15:14)
Viktor Simjanoski
The trained transition matrix looks like

    0.91357   0.086425
    0.079794  0.92021

so I really have no idea what the source of this instability is. Any clues?
Do you expect the model to be "not jumpy"? You could add an "inertia prior" by boosting the prior probability of self-transitions.
Yes, I expect it to be much less jumpy than it is. Even though the time window includes around 2,000 observations, a single new one can make the posterior go from [0.9 0.1] to [0.1 0.9]. Can you please elaborate on your suggestion of an "inertia prior"? How exactly do I use it? Where can I learn more about this? Thanks