Revision history
Revision n. 1

Jul 23 '10 at 20:05

Alexandre Passos

One thing you can do is deliberately degrade your inputs a little. For example, if "t"s and "d"s are usually confused by your transcription software, replace both of them with the same arbitrary symbol. In the same way, if a letter is usually dropped (say, a silent "g" at the end of a word), you can remove it from the other places where it appears. If some words are mistaken for each other, use a single feature for them, and so on. I'm not sure that high regularization is the way to go; this depends a lot more on what sort of classification you're going to do than on the quality of your features.

Revision n. 2

Jul 23 '10 at 20:07

Alexandre Passos

One thing you can do is deliberately degrade your inputs a little. For example, if "t"s and "d"s are usually confused by your transcription software, replace both of them with the same arbitrary symbol. In the same way, if a letter is usually dropped (say, a silent "g" at the end of a word), you can remove it from the other places where it appears. If some words are mistaken for each other, use a single feature for them, and so on. I'm not sure that transfer learning is the way to go, since you have corrupted features, and you would have to find a larger data set of labeled examples for your specific problems, which are not always available.
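
To make the degrading idea concrete, here is a minimal sketch in Python. The tables `CONFUSED_LETTERS`, `DROPPED_LETTERS` and `CONFUSED_WORDS` are made-up placeholders, not measured data; in practice you would fill them in by looking at the errors your own transcription software tends to make. The function names are also just illustrative.

```python
import re

# Hypothetical confusion tables (assumptions for illustration only).
CONFUSED_LETTERS = {"t": "#", "d": "#"}  # "t" and "d" collapse to one symbol
DROPPED_LETTERS = {"g"}                  # letters the transcriber often drops
CONFUSED_WORDS = {"their": "WORD_THERE", "there": "WORD_THERE"}


def degrade_token(token):
    """Map one token to its degraded form."""
    token = token.lower()
    # Words that get mistaken for each other share a single feature.
    if token in CONFUSED_WORDS:
        return CONFUSED_WORDS[token]
    chars = []
    for ch in token:
        if ch in DROPPED_LETTERS:
            continue  # remove the unreliable letter wherever it appears
        chars.append(CONFUSED_LETTERS.get(ch, ch))
    return "".join(chars)


def degrade(text):
    """Tokenize text and return the list of degraded tokens (features)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    degraded = (degrade_token(tok) for tok in tokens)
    return [tok for tok in degraded if tok]  # skip tokens that became empty


print(degrade("The dog is singing there"))
```

With these tables, "bad" and "bat" end up as the same feature, so a transcription error between them no longer changes the input to the classifier. The price is that genuinely different words can collide too, so it is worth checking how much this hurts on clean data.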

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.