Hi, I am using CRFs for transliteration. Earlier I used CRF++, but I realized that it takes a lot of time to train on larger datasets, and also that it is trained with an MLE approach, while it has been shown that there are better ways to train a CRF. Then I came across CrfSgd, but CrfSgd doesn't have an option to return the top-20 results (the way CRF++ does), which I desperately need for transliteration (since the first answer is not always right). I want to write some code from scratch to train a log-linear CRF using SGD and then find the top 20 the way CRF++ does, but I have no idea where or how to start :( . [Obviously, it's not advisable to reinvent the wheel, but I see it as a good opportunity to learn how to implement a popular learning method.] I use Python for everything.
Getting the top K results from a linear-chain CRF is a reasonably simple modification of the decoding (Viterbi search) step of classification. The idea is that, for each node in the chain, instead of keeping only the best score for each possible value of that node and the previous node (possibly more than one, depending on the order of the chain), you keep the K best values. Then at the last step you no longer search for the single best path overall but for the K best. I think it's probably easier to hack this into CRFsgd than to implement your own CRFs in Python. If you do want to implement your own CRFs in Python, you essentially need:
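The k-best modification described above can be sketched in Python. This is a minimal illustration, not CRF++'s actual implementation: the local scores, transition scores, and state labels here are placeholders for whatever your feature functions produce.

```python
import heapq

def kbest_viterbi(obs_scores, trans, states, k):
    """K-best decoding for a first-order linear-chain model.

    obs_scores: list of dicts, obs_scores[t][s] = local score of state s
                at position t (e.g. the sum of active feature weights)
    trans:      dict, trans[(s_prev, s)] = transition score
    states:     iterable of state labels
    k:          number of paths to return
    Returns a list of (score, path) pairs, best first.
    """
    # beams[s] = up to k best (score, path) entries ending in state s
    beams = {s: [(obs_scores[0].get(s, 0.0), [s])] for s in states}
    for t in range(1, len(obs_scores)):
        new_beams = {}
        for s in states:
            cands = []
            for s_prev, entries in beams.items():
                for score, path in entries:
                    cands.append((score
                                  + trans.get((s_prev, s), 0.0)
                                  + obs_scores[t].get(s, 0.0),
                                  path + [s]))
            # keep only the k best extensions ending in state s
            new_beams[s] = heapq.nlargest(k, cands, key=lambda x: x[0])
        beams = new_beams
    # merge the per-state beams and take the overall k best
    final = [entry for entries in beams.values() for entry in entries]
    return heapq.nlargest(k, final, key=lambda x: x[0])
```

Keeping the k best entries per state at each position is enough for exact k-best decoding, because any globally k-best path's prefix ending in state s at position t must be among the k best prefixes at (t, s).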
Plus, you need to think of a way of representing the very big, very sparse feature vectors that appear when one uses CRFs, and a way of representing the topology of the CRF (assuming you're not using a single-feature linear-chain CRF).

Yeah, earlier the idea was to look at the CrfSgd code and extend it to work for the transliteration problem. But then I wanted the practical experience of implementing something from scratch, and that's why I am eager to do this. (It's an itch which will hopefully have a logical and happy ending :) ). Thanks a lot for your answer. I need to get started fast, as it does sound like quite a bit of work :)
(Jan 17 '11 at 14:09)
crazyaboutliv
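The sparse feature representation mentioned in the answer can be as simple as a dict mapping active feature names to values, so that scoring is a sparse dot product. The feature names below are made-up examples:

```python
def dot(sparse_features, weights):
    """Score a sparse feature vector (dict: feature -> value) against
    a weight vector stored the same way; absent features contribute 0."""
    return sum(v * weights.get(f, 0.0) for f, v in sparse_features.items())

# hypothetical binary indicator features active at one position
feats = {"word=cat+tag=N": 1.0, "prevtag=D+tag=N": 1.0}
weights = {"word=cat+tag=N": 0.8, "prevtag=D+tag=N": 0.3,
           "word=dog+tag=N": 0.1}
score = dot(feats, weats if False else weights)  # 0.8 + 0.3 = 1.1
```

Only the features that actually fire at a position are stored, so the millions of possible features never have to be materialized as a dense vector.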
Actually, you don't need to implement the entire thing in order to get the behavior you want. You could keep using CrfSgd (or any other optimizer) for fitting the parameters. Then you will have to understand the file format of the learned model; this is usually a quite readable text file. Alternatively, you could tweak the code to dump the model in a convenient format. Once you have the parameter values, all you need to do is implement the Viterbi decoding (plus k-best) using the learned parameters. This is much easier than implementing the training part. (Alternatively, it might be even easier to just write code to transform the trained model from the CrfSgd format to that of CRF++ or whatever you were using.) You could also check the CRF implementations available in Mallet or LingPipe.

Thanks, I will look at the file formats and see how this can be done.
(Jan 18 '11 at 01:07)
crazyaboutliv
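Reading the learned parameters back can be a few lines once the dump format is understood. The exact CrfSgd and CRF++ formats differ from this, so inspect the real file first; as an illustration, assuming a hypothetical dump with one whitespace-separated "feature-name weight" pair per line:

```python
def load_weights(path):
    """Parse a hypothetical model dump with one 'feature weight' pair per
    line into a dict mapping feature names to floats.
    (Real CrfSgd/CRF++ model files have more structure than this.)"""
    weights = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip blanks and comments
            name, value = line.rsplit(None, 1)
            weights[name] = float(value)
    return weights
```

With the weights in a dict, the decoding step is just the Viterbi search scored against them.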