We know from Ryan McDonald's spanning-tree work that we can view dependency parsing as assigning a potential to every edge of a fully connected graph (modulo some corner cases in handling the root), and then running a maximum spanning tree algorithm to find the most likely parse for a sentence. What if, instead of the single most likely parse, we want to generate samples from the corresponding distribution? Is there an obvious way to do this? I suspect there is, because we know how to do it for constituent trees generated by a PCFG. Perhaps it's obvious and I'm just not seeing it in the dependency case...
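
For concreteness, here's a toy version of the setup I mean, leaning on networkx's Edmonds implementation for the arborescence step; the sentence and arc scores are made up:

```python
# Sketch of the MST view: weight every head -> modifier arc, then take the
# maximum spanning arborescence rooted at an artificial ROOT node (node 0).
import networkx as nx

# Made-up arc potentials for "ROOT she saw him": (head, modifier) -> score.
scores = {(0, 2): 6.0, (2, 1): 5.0, (2, 3): 4.0,
          (0, 1): 2.0, (0, 3): 2.0, (1, 2): 1.0,
          (3, 2): 1.0, (1, 3): 0.5, (3, 1): 0.5}

G = nx.DiGraph()
for (h, m), s in scores.items():
    G.add_edge(h, m, weight=s)

# ROOT has no incoming arcs, so the arborescence is forced to be rooted at 0.
tree = nx.maximum_spanning_arborescence(G)
print(sorted(tree.edges()))  # the best parse's arcs, here [(0, 2), (2, 1), (2, 3)]
```
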
Hey Hal. Certainly for the projective case one can use this trick to convert an arc-factored model to a CFG, and then you should be able to run standard PCFG sampling techniques. This Wallach et al. paper might also help you. The non-projective case is going to be trickier, since the top-down step of the Johnson et al. paper assumes projectivity. I do think something can be done here (via the matrix-tree theorem, or something similar). This paper might be a good starting point.
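
To make the projective recipe concrete, here's a rough sketch of the inside-then-sample-top-down idea for a grammar in Chomsky normal form; the data structures and names are my own, and in practice you'd want log-space arithmetic:

```python
import random
from collections import defaultdict

def inside(words, binary, lexical):
    """CKY inside chart: chart[i, j, A] = total weight of A spanning words[i:j].
    binary maps A -> [(B, C, weight)]; lexical maps (A, word) -> weight."""
    n = len(words)
    chart = defaultdict(float)
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                chart[i, i + 1, A] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for A, rules in binary.items():
                chart[i, j, A] = sum(p * chart[i, k, B] * chart[k, j, C]
                                     for B, C, p in rules
                                     for k in range(i + 1, j))
    return chart

def sample_tree(words, chart, binary, i, j, A):
    """Top-down: choose each (rule, split) pair with probability proportional
    to its share of the inside weight of (i, j, A)."""
    if j - i == 1:
        return (A, words[i])
    r = random.uniform(0.0, chart[i, j, A])
    for B, C, p in binary[A]:
        for k in range(i + 1, j):
            r -= p * chart[i, k, B] * chart[k, j, C]
            if r <= 0.0:
                return (A, sample_tree(words, chart, binary, i, k, B),
                           sample_tree(words, chart, binary, k, j, C))
    raise RuntimeError("rounding left us past the end; use log-space sums")
```

Starting from the whole-sentence span and the start symbol, each call returns an exact sample: trees come out with probability proportional to their total weight.
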
Yeah, I can see how to do it in the projective case, either smartly (à la Hanna's paper) or dumbly (by resampling single edges one at a time, à la Alexandre's comment below), but I'm really more interested in the non-projective case. (This is for a broader class of problems than just dependency parsing.) I had also thought about the matrix-tree theorem -- that's basically what makes me believe this should at least be possible; rough sketch of the partition-function computation below.
(Sep 13 '10 at 11:22)
hal
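
To pin down the matrix-tree piece: a minimal sketch of the partition-function computation, following the construction in Koo et al. (2007) and Smith & Smith (2007); the array layout and names here are my assumptions:

```python
import numpy as np

def log_partition(arc_scores, root_scores):
    """log Z, where Z = total weight of all non-projective dependency trees
    under an arc-factored model, via the directed matrix-tree theorem.
    arc_scores[h, m]: score of word h -> word m (n x n, diagonal unused);
    root_scores[m]: score of ROOT -> word m."""
    W = np.exp(arc_scores)
    np.fill_diagonal(W, 0.0)
    L = -W
    np.fill_diagonal(L, W.sum(axis=0))   # L[m, m] = total incoming arc weight
    L[0, :] = np.exp(root_scores)        # Koo et al.'s trick: row 0 holds root weights
    sign, logdet = np.linalg.slogdet(L)
    return logdet

# Edge marginals follow from d(log Z)/d(score), i.e. from inv(L); with exact
# marginals in hand, one plausible route to sampling is to fix arcs one at a
# time, renormalizing via determinant ratios after each choice.
```
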
I read the paper a while ago so I'm not sure of the details, but Multilingual Dependency Parsing using Global Features (Nakagawa, 2007) used Gibbs sampling for non-projective dependency parsing.
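
If it helps, the basic move is easy to sketch. This is a toy single-head resampler under an arc-factored score, not Nakagawa's actual model (which folds richer global features into the weights):

```python
import math, random

def creates_cycle(heads, m, h):
    """Would setting heads[m] = h create a cycle? Walk up from h toward ROOT."""
    while h != 0:              # 0 is the artificial ROOT
        if h == m:
            return True
        h = heads[h]
    return False

def gibbs_sweep(heads, scores):
    """One sweep of single-head Gibbs: for each word m, resample its head from
    P(h | everything else), proportional to exp(scores[h][m]) and restricted
    to heads that keep the structure a tree. (Single-root and other hard
    constraints are ignored here.)"""
    n = len(heads) - 1         # words are 1..n; heads[0] is unused
    for m in range(1, n + 1):
        cands = [h for h in range(n + 1)
                 if h != m and not creates_cycle(heads, m, h)]
        weights = [math.exp(scores[h][m]) for h in cands]
        heads[m] = random.choices(cands, weights=weights)[0]
    return heads
```

Under an arc-factored model each conditional is exact; how well the chain mixes is of course the usual caveat.
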
This paper might solve your problem.