If you were to implement an MT system today, which model would you pick? In other words, which MT model gives the best translation results? Still phrase-based ones like Moses? GHKM? Quirk's dependency treelets? Hiero? Syntax on the target side, on the source side, or on both?

I am a believer in at least target-side syntax, but I don't think it makes much sense to talk about "a model producing the best results" in the context of MT. Translation models are tweaked as hell, and there are tons of things beyond the base model that contribute to the performance of a good translation system.

I agree with @yoavg. There are lots of things that contribute to the performance of a good MT system. I would personally start with syntax on the source side (preferably dependency-based), with some sort of tree transformations to produce target-side trees. I would add as many parallel resources as possible, in the form of translation memories, bilingual dictionaries, and parallel treebanks. And as I am a believer in rule-based systems, I would seed hand-written rules into the places where the machine-learning system fails. Or use ML only for those parts where cleverly devised rules fail to tackle the linguistic variation.
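
To make "tree transformations to produce target-side trees" a bit more concrete, here is a minimal Python sketch, not taken from any real toolkit: a single hand-written transfer rule that reorders adjectival modifiers on a source dependency tree (English-style adjective-noun order to Romance-style noun-adjective order). The `Node` class, the `amod`/`det` labels, and the rule itself are all hypothetical illustrations.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical dependency node: children are split into those that
# surface before the head and those that surface after it.
@dataclass
class Node:
    word: str
    deprel: str                                        # relation to the head, e.g. "amod"
    left: List["Node"] = field(default_factory=list)   # children linearized before the head
    right: List["Node"] = field(default_factory=list)  # children linearized after the head

def postpose_adjectives(node: Node) -> Node:
    """Hand-written transfer rule: move adjectival modifiers ("amod")
    from before the head noun to after it, recursively."""
    adjs = [c for c in node.left if c.deprel == "amod"]
    node.left = [c for c in node.left if c.deprel != "amod"]
    node.right = adjs + node.right
    for child in node.left + node.right:
        postpose_adjectives(child)
    return node

def linearize(node: Node) -> str:
    """Read the surface string back off the (transformed) tree."""
    parts = ([linearize(c) for c in node.left]
             + [node.word]
             + [linearize(c) for c in node.right])
    return " ".join(parts)

# "the red car": after the rule, the adjective follows the noun,
# giving the order a lexical-transfer step would need for, say,
# French ("la voiture rouge").
tree = Node("car", "root", left=[Node("the", "det"), Node("red", "amod")])
print(linearize(postpose_adjectives(tree)))  # -> "the car red"
```

In the hybrid setup described above, rules like this would fire only in the constructions where the learned model is known to go wrong, with the statistical components handling everything else.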