Are there any efficient ways to do structure learning in an online fashion? All the papers I find on structured output prediction just assume you can quickly find the optimal configuration. Am I supposed to just run a greedy algorithm at every time step as I update my parameters?
You should read the SampleRank paper. It's a simple algorithm that does very fast structured learning. It's not online, but it can train (say) a trigram POS tagger on Wall Street Journal data in a few minutes, because it updates the parameters as you Gibbs-sample around a solution.
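For concreteness, here's a minimal SampleRank-style sketch in Python, based on my reading of the paper (not the authors' code); features, loss, and propose are placeholders you'd supply:

    # SampleRank sketch: interleave local MCMC-style moves with
    # perceptron-style ranking updates on the parameters.
    import numpy as np

    def samplerank(x, y_gold, y_init, features, loss, propose,
                   num_steps=1000, lr=0.1):
        # features(x, y) -> np.ndarray, loss(y, y_gold) -> float,
        # propose(y) -> a neighboring configuration (one local change).
        w = np.zeros_like(features(x, y_init), dtype=float)
        y = y_init
        for _ in range(num_steps):
            y_new = propose(y)
            df = features(x, y_new) - features(x, y)
            d_loss = loss(y, y_gold) - loss(y_new, y_gold)
            # Update whenever the model's ranking of (y, y_new)
            # disagrees with the ranking induced by the true loss.
            if d_loss > 0 and w @ df <= 0:    # truth prefers y_new
                w += lr * df
            elif d_loss < 0 and w @ df >= 0:  # truth prefers y
                w -= lr * df
            if w @ df > 0:  # greedy move; the paper samples instead
                y = y_new
        return w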
Did you check this paper by Murphy? He's good with algorithms for fast inference on Bayesian nets that use prior structures over the space of possible graphs, so you don't just make random changes and hope for the best.
Do you know "(Online) Subgradient Methods for Structured Prediction" http://repository.cmu.edu/cgi/viewcontent.cgi?article=1054&context=robotics? Not sure if it will help you, though.
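If it does help: the core of that paper, as I understand it, is an online subgradient step on the structured hinge loss. A rough sketch, where loss_augmented_argmax is a placeholder for whatever decoder you have:

    import numpy as np

    def online_subgradient_step(w, x, y_gold, features,
                                loss_augmented_argmax, lr=0.1, reg=1e-4):
        # loss_augmented_argmax(w, x, y_gold) should (approximately) solve
        #   argmax_y  w . features(x, y) + task_loss(y, y_gold)
        y_hat = loss_augmented_argmax(w, x, y_gold)
        # Subgradient of the structured hinge loss plus an L2 regularizer.
        g = features(x, y_hat) - features(x, y_gold) + reg * w
        return w - lr * g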
I am not sure I understood your question: what does the online setting have to do with finding the argmax for a single example? And where do you need a greedy algorithm?
Andreas: there are two senses in which "online" can be understood in structured prediction. One is the one you mention: you get one input and one structured output at a time, make a prediction, suffer a loss, and move on. What zaxtax seems to be talking about, however, is a setting where you can make parameter updates before inference has converged, which can be useful when inference is intractable or just expensive. That is the sense in which SampleRank can help.
That is a weird meaning of "online". zaxtax: What kind of objective are you interested in? Maximum margin or maximum likelihood? What does your inference problem look like and what kind of inference algorithm are you using?
Usually you use the BIC score; maximum likelihood is not a very good estimator for the optimal graph structure.
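For reference, the BIC score of a candidate graph G is the maximized log-likelihood minus a complexity penalty:

    \mathrm{BIC}(G) = \log p(D \mid \hat{\theta}_G, G) - \frac{k_G}{2} \log N

where \hat{\theta}_G are the maximum-likelihood parameters, k_G is the number of free parameters, and N is the number of data points.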
What's the correct terminology to clarify that this is a setting where inference is ridiculously intractable?