Which is more useful for structure prediction, a support vector machine or a neural network, in terms of 1) cost, 2) implementation complexity, and 3) time? Are there general fields where either of them is particularly suitable? Recently, while going through some articles on protein structure prediction, I learned that support vector machines are considered more suitable than neural networks there. Thanks in advance.
There are essentially two ways you can use neural networks for structured prediction in the NLP/vision sense of the term: (1) you can use a neural network to score a feature representation like the one used by CRFs or structured SVMs, something like beam search to find labellings, and something like the perceptron to update your weights (backpropagating); (2) you can do something like Richard Socher does with his recursive neural networks (see for example his paper on parsing), where you reduce structured prediction to learning a compositionality operator and embedding things in a latent space (roughly). For an overview of these approaches, look at Yann LeCun's energy-based models tutorial; a minimal sketch of option (1) also follows this exchange. Regardless of which is better (NNs or SVM-like things), it really depends on your problem space. If you can design good enough features that there is almost a linear separation between good and bad labellings, then SVMs are going to perform better. If you can't, it's possible (but not guaranteed) that neural networks will do better. However, neural networks are more expensive to train, and they can suffer from problems with inexact inference (exact inference being intractable by definition in these models).
In case we perform multitask learning with some related auxiliary task, so as to generalize the model and reduce the number of features we need, which model architecture do you think would be more suitable?
(Jan 19 '12 at 14:02)
Thetna
This is still not enough information to decide, and will change when you change problem domain, all else fixed. It all comes down to how well your features match the labels.
(Jan 19 '12 at 14:03)
Alexandre Passos ♦
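To make option (1) above concrete, here is a minimal sketch, not taken from any of the cited papers: a tiny network scores a toy joint feature vector phi(x, y), inference is an exhaustive argmax over a handful of candidate labellings (standing in for beam search), and the weights get structured-perceptron-style updates by backpropagating score(gold) minus score(prediction). All names, dimensions, and the feature map are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 16                                   # joint-feature and hidden sizes
W1, b1 = rng.normal(0, 0.1, (H, D)), np.zeros(H)
w2 = rng.normal(0, 0.1, H)

def phi(x, y):
    """Toy joint feature map for an input vector x and a labelling tuple y."""
    return np.concatenate([x * (1 + sum(y)),
                           np.full(D - len(x), len(set(y)), dtype=float)])

def score(f):
    """Neural scoring function: one tanh hidden layer, linear output."""
    return w2 @ np.tanh(W1 @ f + b1)

def predict(x, candidates):
    """Exhaustive argmax over candidate labellings (beam search in practice)."""
    return max(candidates, key=lambda y: score(phi(x, y)))

def perceptron_update(x, y_gold, candidates, lr=0.1):
    """Perceptron-style step: raise score(gold), lower score(prediction)."""
    global W1, b1, w2
    y_hat = predict(x, candidates)
    if y_hat == y_gold:
        return
    grads = []
    for f, sign in [(phi(x, y_gold), +1.0), (phi(x, y_hat), -1.0)]:
        h = np.tanh(W1 @ f + b1)               # forward pass
        dh = w2 * (1.0 - h ** 2)               # backprop through tanh
        grads.append((sign * h, sign * np.outer(dh, f), sign * dh))
    for g2, g1, gb in grads:
        w2 += lr * g2
        W1 += lr * g1
        b1 += lr * gb

# Toy usage: a 4-dimensional input and four candidate binary labellings.
x = rng.normal(size=4)
candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]
perceptron_update(x, y_gold=(1, 0), candidates=candidates)

In a real system the candidate set would come from beam search and phi would encode CRF-style factor features rather than this toy construction.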
Could you give us some more information about your available dataset? Generally, a neural network should perform similarly to a support vector machine in many different cases, except that you can train a neural network on billions of examples (e.g. unlabelled ones) where an SVM will explode. Also, you can construct recursive neural networks, and it's not clear how to compute recursive SVMs (a small illustrative sketch of the recursive idea follows below).
(Jan 26 '12 at 04:25)
Joseph Turian ♦♦
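Roughly what "recursive" refers to here, as an illustrative sketch only (the dimensions, composition operator, and toy tree are all made up): a single learned composition operator merges two child vectors into a parent vector and is applied up a binary tree, so a variable-size structured input maps to one fixed-size embedding. There is no obvious kernel-machine counterpart of such a learned, stackable operator.

import numpy as np

rng = np.random.default_rng(0)
d = 5
W = rng.normal(0, 0.1, (d, 2 * d))   # learned composition matrix
b = np.zeros(d)

def compose(left, right):
    """Merge two child vectors into one parent vector."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

def embed(tree):
    """tree is either a leaf vector or a (left, right) pair of subtrees."""
    if isinstance(tree, np.ndarray):
        return tree
    left, right = tree
    return compose(embed(left), embed(right))

# Toy usage: embed the tree ((w1, (w2, w3))) into a single d-dimensional vector.
leaves = [rng.normal(size=d) for _ in range(3)]
sentence_vec = embed((leaves[0], (leaves[1], leaves[2])))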
Can you give a reference for structured prediction with neural networks? I have never heard of that.
Here is a Wikipedia link mentioning neural networks for protein structure prediction: http://en.wikipedia.org/wiki/Protein_structure_prediction#Machine_learning
This is a very specific application, and the only reference I found there is a book that I obviously don't have. Is there any source describing how to do structured prediction with neural networks? I work on structured prediction and, as I said, I have never heard of this. There seem to be some papers in biology, but I haven't found any papers in machine learning describing the method.
I am no expert in neural networks, but I am not sure why structured prediction with neural networks should not work. People have tried structured prediction with SVMs by taking as input a joint feature representation of the input and the output structure and then making a prediction via an argmax over all possible outputs (see the sketch after this exchange). Have a look at the paper "Support Vector Machine Learning for Interdependent and Structured Output Spaces" (ICML 2004, Tsochantaridis et al.). So why should this not be possible with a neural network as well? Essentially, only the cost function being optimized changes.
Yes, it might in principle work. The question was more "has anyone studied this?". I would have liked to look at a reference paper.
Thetna asked which of the two methods is preferred, but I don't see neural networks for structured prediction being present in the literature at all.
A more technical aspect to answer your question: I think one important aspect of structured SVMs is doing loss-augmented inference during training. If I'm not completely mistaken, the fact that the gradient only depends on the single most violated ("wrongest") solution is due to the hinge-loss-like behaviour of the loss functions used. Without this, I would expect that you would explicitly have to sum over all possible labellings to compute the gradient, which would usually be infeasible.
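To make the last two comments concrete, here is a minimal sketch of the structured-SVM idea; the feature map Phi, the Hamming loss, and the exhaustive argmax are illustrative stand-ins, not the exact setup of Tsochantaridis et al. Prediction is an argmax of w . Phi(x, y) over candidate outputs, and the subgradient of the margin-rescaled structured hinge loss depends only on the single most violating, loss-augmented labelling, so no sum over all outputs is needed.

import numpy as np

def hamming(y, y_gold):
    """Task loss Delta(y, y_gold): number of mismatched positions."""
    return float(sum(a != b for a, b in zip(y, y_gold)))

def predict(w, x, candidates, Phi):
    """Inference: argmax of the linear score over candidate outputs."""
    return max(candidates, key=lambda y: w @ Phi(x, y))

def hinge_subgradient(w, x, y_gold, candidates, Phi):
    """Subgradient of the margin-rescaled structured hinge loss."""
    # Loss-augmented inference: the single most violating labelling.
    y_star = max(candidates, key=lambda y: hamming(y, y_gold) + w @ Phi(x, y))
    violation = hamming(y_star, y_gold) + w @ Phi(x, y_star) - w @ Phi(x, y_gold)
    if violation <= 0:
        return np.zeros_like(w)                 # margin satisfied
    return Phi(x, y_star) - Phi(x, y_gold)      # depends on y_star only

# Toy usage with a made-up joint feature map over binary labellings.
Phi = lambda x, y: np.concatenate([x * (1 + sum(y)), [float(len(set(y)))]])
x = np.array([1.0, -2.0, 0.5])
candidates = [(0, 0), (0, 1), (1, 0), (1, 1)]
w = np.zeros(4)
w -= 0.1 * hinge_subgradient(w, x, (1, 1), candidates, Phi)

Swapping the linear score w @ Phi(x, y) for a neural network's score is essentially the modification the comments above are asking about; only the gradient computation changes.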
Another reference on doing structured prediction using neural networks is the work by Weston + Collobert.
For parsing, there is the work of Henderson et al., whose neural-network parsing models had the highest accuracy for a couple of years.
In general, one can decompose a structured prediction problem into a sequence of intermediate classification decisions, depending upon your choice of logic (e.g. "should I push this item onto the stack or reduce?" for shift-reduce parsing logic). You can use any classifier to learn these decisions, e.g. a neural network. This is called a "history-based model".
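As a rough illustration of that loop (the threshold "classifier" and the state features are made-up placeholders for a trained model): at each step the current parser state is featurized and a classifier chooses between shifting the next token and reducing the top two stack items.

def history_based_parse(tokens, decide):
    """Shift-reduce loop; `decide` is any classifier mapping state features
    to 'shift' or 'reduce' (a neural network in a real history-based model)."""
    stack, buffer, decisions = [], list(tokens), []
    while buffer or len(stack) > 1:
        features = (len(stack), len(buffer))      # features of the history/state
        if buffer and (len(stack) < 2 or decide(features) == "shift"):
            stack.append(buffer.pop(0))
            decisions.append("shift")
        else:
            right, left = stack.pop(), stack.pop()
            stack.append((left, right))           # reduce: merge top two items
            decisions.append("reduce")
    return stack[0], decisions

# Stand-in classifier: shift while the buffer is longer than the stack.
tree, history = history_based_parse(
    "the cat sat".split(),
    decide=lambda f: "shift" if f[1] > f[0] else "reduce")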
See this question for more detail: http://metaoptimize.com/qa/questions/2535/history-based-models-for-structured-prediction
Read my thesis for even more detail. Or post a followup question.