|
In the paper "Gradient-Based Learning Applied to Document Recognition" by LeCun, Bottou, Bengio and Haffner, the authors propose using an SDNN (Space Displacement Neural Network), i.e. a convnet slid across the input, to recognize hand-written characters. The raw outputs of the slid recognizer contain some errors, for instance multiple recognitions of the same character, because the convolutional network is translation invariant and so fires at several neighbouring positions. Moreover, part of a character may be recognized as a character in its own right, e.g. the right part of a 4 may be confused with a 1. Transducers are proposed to address these issues, in particular two types: character model transducers and language model transducers. The higher-level transducer is the language model transducer, which ensures that the recognized word is lexically correct according to a grammar. I do not think that a simple grammar transducer can entirely solve the problem of repeated recognitions of the same character, since repeated letters do occur in natural language and may therefore be allowed by the grammar. Not much is said about the character model transducer, but I guess it is responsible for avoiding repeated recognitions of the same character. Does anybody have an idea of how a character model transducer can be implemented? |
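
To make the question concrete, here is a minimal sketch of what I imagine such a transducer could do. It is only my guess, not the paper's method. It assumes the slid recognizer emits one hard label per window position and that there is a "no character" label (the `BLANK` symbol and the `collapse_frames` function are names I made up). A run of identical labels is emitted once, and a genuine double letter is only accepted when a blank separates the two runs. Presumably a real implementation would work on scored paths rather than hard labels, which this sketch ignores.

```python
from itertools import groupby

BLANK = "_"  # hypothetical "no character" label, not taken from the paper


def collapse_frames(frame_labels, blank=BLANK):
    """Map per-window labels to a character string.

    Consecutive identical labels are treated as one detection of the same
    character; blanks are dropped afterwards. A doubled letter therefore
    survives only if a blank separates its two runs.
    """
    # groupby yields one key per run of equal consecutive labels.
    runs = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in runs if label != blank)


if __name__ == "__main__":
    print(collapse_frames("__ccc_aa__tt_"))  # repeated detections merged
    print(collapse_frames("bb_oo_oo_kk"))    # double "o" kept via the blank
```

With this rule the decision about repeated letters no longer depends on the grammar: "oo" followed by a blank and another "oo" gives two letters, while "oooo" gives one.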