One thing you can do is deliberately degrade your inputs a bit. For example, if your transcription software usually confuses "t" and "d", replace both of them with the same arbitrary symbol. In the same way, if a letter is usually dropped (say, a silent "g" at the end of a word), you can remove it everywhere else it appears as well. If some words are regularly mistaken for one another, use a single feature for all of them, and so on. I'm not sure that heavy regularization is the way to go: that depends a lot more on what sort of classification you're going to do than on the quality of your features.
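As a rough sketch of what that degrading could look like in Python: the confusion data below (the `#` placeholder, the dropped "g", and the `their`/`there` word class) is made up for illustration, and `degrade` is a hypothetical helper name. In practice you would fill these in from the errors you actually observe in your transcriber's output.

```python
import re

# Assumed confusion data, for illustration only.
# "t" and "d" both map to "#", and "g" is deleted wherever it appears.
CHAR_TABLE = str.maketrans("td", "##", "g")

# Words that get mistaken for each other share one feature name.
WORD_CLASSES = {"their": "THERE_CLASS", "there": "THERE_CLASS"}


def degrade(text):
    """Return a list of degraded tokens to use as features."""
    tokens = re.findall(r"[a-z']+", text.lower())
    features = []
    for token in tokens:
        if token in WORD_CLASSES:
            # Whole-word confusion: collapse to a single feature.
            features.append(WORD_CLASSES[token])
            continue
        degraded = token.translate(CHAR_TABLE)
        if degraded:  # skip tokens that became empty
            features.append(degraded)
    return features


print(degrade("The dog is going there"))
```

With this mapping, "bad" and "bat" end up as the same feature, so a t/d transcription error no longer splits the counts. Apply the same function to both training and test text before building your feature vectors.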