|
Hi, I'm using a structural SVM to learn the parameters of a graphical model with pairwise and higher-order potentials. When generating negative examples for the SVM learning, I perform loss-augmented inference (using a message passing algorithm to find the MAP solution), which amounts to finding a variable assignment that is highly likely under the current model AND has high loss. I am wondering what the best way to trade off these two goals is. I almost always find examples that have a very high loss but are not very probable. Naively, I have tried scaling down the contribution of the loss function to MAP inference, but I am wondering whether there is a standard way to deal with this problem. I haven't been able to find any details about this. I expected that as the weights of the model grew, the loss function would have less effect on the loss-augmented inference, but the weights never seem to get large enough to make much of a difference. I am hoping this is a common problem, and not an indication that there's something wrong in my code! Any thoughts or pointers to papers covering this topic would be greatly appreciated! |
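
To make the setup concrete, here is a minimal sketch of loss-augmented inference with a scaled loss term. It is only an illustration under several assumptions: a tiny chain model with hand-picked unary and pairwise scores, Hamming loss, and brute-force enumeration standing in for message passing. The names `model_score`, `hamming_loss`, `loss_augmented_map`, and `loss_scale` are hypothetical and do not come from any library.

```python
from itertools import product

def model_score(y, unary, pairwise):
    """Model score of labeling y on a chain: unary terms plus pairwise terms."""
    score = sum(unary[i][label] for i, label in enumerate(y))
    score += sum(pairwise[a][b] for a, b in zip(y, y[1:]))
    return score

def hamming_loss(y, y_true):
    """Number of positions where y disagrees with the ground truth."""
    return sum(a != b for a, b in zip(y, y_true))

def loss_augmented_map(unary, pairwise, y_true, loss_scale=1.0):
    """Return the labeling maximizing model score + loss_scale * loss.

    Brute force over all labelings; this is an assumption for the toy
    example and only feasible for very small models.
    """
    n_labels = len(unary[0])
    candidates = product(range(n_labels), repeat=len(unary))
    return max(
        candidates,
        key=lambda y: model_score(y, unary, pairwise)
        + loss_scale * hamming_loss(y, y_true),
    )

# Toy problem (made-up numbers): three binary variables, model prefers label 0.
unary = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0]]
pairwise = [[0.5, 0.0], [0.0, 0.5]]
y_true = (0, 0, 0)

for scale in (1.0, 3.0):
    y_hat = loss_augmented_map(unary, pairwise, y_true, loss_scale=scale)
    print(scale, y_hat, model_score(y_hat, unary, pairwise), hamming_loss(y_hat, y_true))
```

In this toy case the two scales return different labelings: the larger `loss_scale` picks the labeling with maximal loss and low model score, which is the behavior described above, while the smaller one stays with the labeling the model prefers.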