|
What are good hashing kernels? Would it be possible to use the collision rate to select a set of hashing kernels? Or do the hashing kernels need to be locality-sensitive? |
|
Usually people use MurmurHash or a look-alike as the hashing function, because it's fast and it's easy to build many independent hash functions out of it (typically by varying the seed). Optimizing the hash functions themselves is hard and kind of pointless, since what you really want is for them to be independent so that collisions are handled well. What you hash (words, bigrams, character n-grams, or more complex features), on the other hand, can and should be optimized with whatever method you normally use (ablation, feature selection, etc.) to make sure you're not using too many useless features. |
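
To make the idea concrete, here is a minimal sketch of feature hashing with several seeded hash functions. It is only an illustration under some assumptions: it uses `hashlib.blake2b` from the Python standard library with a per-seed salt as a stand-in for MurmurHash (so it runs without extra packages, but it is slower than a real MurmurHash), and the names `seeded_hash`, `hash_features`, `char_ngrams`, `num_buckets`, and `num_hashes` are hypothetical, not from any particular library.

```python
import hashlib


def seeded_hash(token: str, seed: int) -> int:
    """Return a 64-bit hash of `token`; different seeds act as different hash functions.

    Assumption: blake2b with a salt stands in for a seeded MurmurHash.
    """
    digest = hashlib.blake2b(
        token.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def char_ngrams(word: str, n: int = 3) -> list:
    """Character n-grams with boundary markers: one choice of *what* to hash."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def hash_features(tokens, num_buckets: int = 1024, num_hashes: int = 2) -> list:
    """Map tokens into a fixed-size vector using `num_hashes` seeded hashes."""
    vec = [0.0] * num_buckets
    for tok in tokens:
        for seed in range(num_hashes):
            h = seeded_hash(tok, seed)
            sign = 1.0 if h & 1 else -1.0      # lowest bit picks the sign
            index = (h >> 1) % num_buckets     # remaining bits pick the bucket
            vec[index] += sign
    return vec


if __name__ == "__main__":
    words = ["hashing", "kernels"]
    word_vec = hash_features(words)
    ngram_vec = hash_features([g for w in words for g in char_ngrams(w)])
    print(sum(1 for x in word_vec if x != 0.0), "non-zero buckets (words)")
    print(sum(1 for x in ngram_vec if x != 0.0), "non-zero buckets (char n-grams)")
```

The hash function stays fixed here; what changes between the two calls is the feature set (whole words versus character n-grams), which is the part worth tuning with ablation or feature selection.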