|
I am not an expert in SVMs and kernels. My question is: how do I select a suitable kernel function for a problem? The LibSVM guide suggests cross-validation for this purpose, starting with the RBF kernel. I then read some papers that consider incorporating prior knowledge about the problem at hand into kernel selection, in particular the class-invariance property and knowledge about the data. I am curious to know how exactly information about the data set can be extracted and used for kernel selection. Or is there another way to guide the kernel selection process? So far, after reading related articles, I have some idea about selecting a suitable kernel using prior knowledge (though I am not sure whether this is meaningful):
Any help would be appreciated, and sorry for the bother. Thanks in advance. |
|
To follow up on Leon's and Andreas's answers, you might want to look at "Kernel Methods in Computer Vision" by Christoph Lampert. There, he explains that since a kernel can be viewed as a kind of similarity measure, the following heuristic can help in choosing one.
Chapter 3 of the same article covers incorporating invariance into the kernel rather than into the feature extraction procedure, and it surveys various kinds of kernels, which I think is what you were looking for.
Yeah, this is a good book and definitely a good place to get a feeling for these methods. Though afaik Christoph usually uses chi2 and rbf himself ;)
(May 28 '12 at 15:18)
Andreas Mueller
@Pardis - Thank you very much for the help.
(May 28 '12 at 16:02)
Raihana
|
|
Hi Raihana. I'm not sure what the class-invariance property is, so I cannot comment on that. In general, the kind of kernel you use is often dictated by your representation of the data. For example, there is a lot of work on kernels for sequences, trees, general graphs, and sets. There are also domain-specific kernels developed in certain communities. If your data representation is just a real vector, most people use linear or RBF kernels. Linear is good since it is fast and needs barely any tuning; RBF is usually the best-performing non-linear one. I haven't really seen any other kernels being used in many applications. One exception is computer vision. There the data is often represented as histograms (so the values are non-negative and sum to one), and chi2 and intersection kernels have proven to be more effective than RBF.
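To make the kernels named above concrete, here is a minimal pure-Python sketch of each one as a similarity function. The function names, the `gamma` defaults, and the two toy histograms are assumptions chosen for illustration; the chi2 variant shown is the exponential form, which is one common definition among several.

```python
import math

def linear_kernel(x, y):
    """Plain dot product; no parameters to tune."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2); gamma controls the kernel width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-squared kernel for non-negative histograms."""
    # Bins where both entries are zero are skipped to avoid dividing by zero.
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)

def intersection_kernel(x, y):
    """Histogram intersection: total overlap between the two histograms."""
    return sum(min(a, b) for a, b in zip(x, y))

# Two toy histograms (hypothetical data), each summing to one.
h1 = [0.5, 0.3, 0.2, 0.0]
h2 = [0.4, 0.4, 0.1, 0.1]

for name, k in [("linear", linear_kernel), ("rbf", rbf_kernel),
                ("chi2", chi2_kernel), ("intersection", intersection_kernel)]:
    print(name, round(k(h1, h2), 4))
```

Any of these can be handed to an SVM implementation that accepts a precomputed kernel matrix, which is the usual route for kernels the library does not ship with.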
@Andreas Mueller - My data representation is just a real vector. Each sample is represented using 40 features (30 of them are binary and the remaining 10 take values between 1 and 20). I learned that RBF is usually the first choice, so I started with linear and then RBF. But in the end I got good results using a polynomial kernel. Can you give me any idea or explanation of why polynomial works better than RBF? Thanks in advance.
(May 29 '12 at 04:29)
Raihana
No. ;) It might depend on the data normalization. For RBF, using zero mean, unit variance is good. But it is usually still quite sensitive to the kernel width gamma. Doing some handwaving, the polynomial kernel might work better if the true function is easily expressed as a polynomial.
(May 29 '12 at 04:33)
Andreas Mueller
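The normalization point can be illustrated with a small sketch. It assumes a made-up data set with one binary feature and one feature in the range 1-20 (mirroring the mix described in the question), and the `standardize` helper is hypothetical, not part of any library. It rescales each column to zero mean and unit variance, then compares RBF values before and after scaling for a few values of gamma.

```python
import math

def standardize(rows):
    """Rescale each column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n)
            for c, m in zip(cols, means)]
    # Constant columns get a divisor of 1 so they map to zero instead of failing.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def rbf_kernel(x, y, gamma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical samples: one binary feature, one feature in the range 1-20.
X = [[0, 2], [1, 3], [0, 18], [1, 19]]
Z = standardize(X)

for gamma in (0.01, 0.1, 1.0):
    raw = rbf_kernel(X[0], X[2], gamma)
    scaled = rbf_kernel(Z[0], Z[2], gamma)
    print(f"gamma={gamma}: raw={raw:.4g}  standardized={scaled:.4g}")
```

On the raw data the squared distance is dominated by the wide-range feature, so the same gamma means something very different before and after scaling. That is one reason an RBF result obtained without normalization or without a gamma search may understate what the kernel can do.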
|
|
Usually you use cross-validation and measure the error of each kernel on a held-out data set. For example, you divide your data set into three parts (training, CV, and test). First you train, let's say, five SVMs, each with a different kernel (or with different parameters for the same kernel). Then you evaluate each trained model on the CV set. After this, you take the one that performed best on the CV set and evaluate it on the final test set. Usually when you use cross-validation, you rotate the CV set over all the possible splits, so that you get an averaged result.
@Leon Palafox - Thanks for your answer. Besides the cross-validation approach, can you give me any idea (or any explanation or reference) about incorporating prior knowledge about the learning problem into kernel selection? In particular, how and what information can be extracted from the training samples to influence the kernel selection? Thanks again.
(May 28 '12 at 09:32)
Raihana
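The cross-validation procedure described in the answer above (set a test set aside, rotate the validation fold, keep the best-scoring kernel) could be sketched as follows. To stay dependency-free, this sketch swaps the SVM for a kernel perceptron, which is a different and much simpler learner; the selection loop around it is the part being illustrated. The toy data, the candidate list, and all function names are assumptions made for this example.

```python
import math
import random

def poly_kernel(degree):
    return lambda x, y: (1.0 + sum(a * b for a, b in zip(x, y))) ** degree

def rbf_kernel(gamma):
    return lambda x, y: math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def decision(X, y, alpha, kernel, x):
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y) if a)

def fit(X, y, kernel, epochs=20):
    """Kernel perceptron (stand-in for an SVM): alpha[i] counts mistakes on sample i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        for i in range(len(X)):
            if y[i] * decision(X, y, alpha, kernel, X[i]) <= 0:
                alpha[i] += 1
    return alpha

def accuracy(X_tr, y_tr, X_te, y_te, kernel):
    alpha = fit(X_tr, y_tr, kernel)
    hits = sum((1 if decision(X_tr, y_tr, alpha, kernel, x) > 0 else -1) == t
               for x, t in zip(X_te, y_te))
    return hits / len(X_te)

def cv_accuracy(X, y, kernel, k=3):
    """Rotate the held-out fold over all k folds and average the accuracy."""
    scores = []
    for fold in range(k):
        tr = [i for i in range(len(X)) if i % k != fold]
        te = [i for i in range(len(X)) if i % k == fold]
        scores.append(accuracy([X[i] for i in tr], [y[i] for i in tr],
                               [X[i] for i in te], [y[i] for i in te], kernel))
    return sum(scores) / k

# Hypothetical toy data: the label is the sign of x1 * x2.
rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(60)]
y = [1 if a * b > 0 else -1 for a, b in X]

# Set the final test set aside before any model selection happens.
X_dev, y_dev, X_test, y_test = X[:45], y[:45], X[45:], y[45:]

# poly_kernel(1) is a linear kernel with a bias term.
candidates = {"linear": poly_kernel(1), "poly2": poly_kernel(2), "rbf": rbf_kernel(1.0)}
cv_scores = {name: cv_accuracy(X_dev, y_dev, kern) for name, kern in candidates.items()}
best = max(cv_scores, key=cv_scores.get)

print("CV accuracy per kernel:", cv_scores)
print("selected:", best, "test accuracy:",
      accuracy(X_dev, y_dev, X_test, y_test, candidates[best]))
```

The test set is touched only once, after the kernel has been chosen, so its accuracy is not biased by the selection step. Kernel parameters such as the polynomial degree or the RBF gamma can be searched the same way by adding more entries to `candidates`.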
|