Deep learning has been shown time and again to outperform everything else, as shown by Yann LeCun or Andrew Ng (except when online logistic regression is good enough for large datasets). So why are people, on this forum for example, discussing anything else? Why are outclassed methods like SVMs, topic models, and CRFs still being discussed?

asked Jul 18 '12 at 14:07


marshallp


I'm not sure if this is a joke or not, but this is what you get from too much PR around deep learning...

(Jul 18 '12 at 14:19) Dov

you might want to see this: http://www.youtube.com/watch?v=4Ak3g67LXTY&feature=plcp.

page 177 here http://nlp.stanford.edu/~socherr/SocherBengioManning-DeepLearning-ACL2012-20120707-NoMargin.pdf

(Jul 18 '12 at 14:33) Dov

Those are not really mutually exclusive. You can use deep learning to come up with better representations, and then stick an SVM or CRF on top (see the sketch below). The disagreement is really about hand-tuned features vs. automatically learned features. In NLP the best results come from combining hand-tuned and learned features.
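
A minimal sketch of that pipeline, assuming scikit-learn is available: an RBM learns a representation and a linear SVM classifies on top of it. The data, layer size, and hyper-parameters below are placeholders for illustration, not a recommendation.

    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    rng = np.random.RandomState(0)
    X = rng.rand(200, 64)        # placeholder data; BernoulliRBM expects values in [0, 1]
    y = rng.randint(0, 2, 200)   # placeholder binary labels

    model = Pipeline([
        # unsupervised feature learner: hidden activations become the new representation
        ("rbm", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)),
        # discriminative classifier stacked on the learned features
        ("svm", LinearSVC(C=1.0)),
    ])
    model.fit(X, y)
    print(model.score(X, y))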

(Jul 20 '12 at 20:51) Yaroslav Bulatov

3 Answers:

From a practical point of view, deep learning needs a lot of data, which isn't always available. It also takes a lot of time to train a large network, which may not be acceptable for every application.

But I think the main thing to remember is that a scientific paper is not the same as a working industrial system.

What you see in a paper is the "best result" for the method; you don't see all the failed attempts that came before it. Deep learning is not an out-of-the-box solution. It requires a lot of optimization and tuning, both of the model's parameters and of the input data.

And if you read enough deep learning papers, you see that they don't outperform everything. In some papers the deep model isn't the best, and of course, if it had performed very badly it wouldn't have been published at all.

Lastly, there are many more problems to solve than the ones you see in deep learning papers. Deep learning is the current trend, so it seems to be everywhere, but it isn't. There are many areas where it is not applicable (at least not yet).

answered Jul 18 '12 at 14:19


rm9

edited Jul 18 '12 at 14:20


Your question comes off as a bit sarcastic, but I will answer it assuming it isn't.

As a deep learning researcher, I will be the first to admit that deep learning is a poorly defined term. There are many reasonable definitions for it, some more expansive and some more restrictive. For example, the most expansive definition I might use would include any learning algorithm that learns a distributed representation of its input and isn't just doing template matching. This doesn't require a neural net (unless you also use a very expansive definition of neural net that includes decision trees!). Sidestepping the definitional problem of exactly what constitutes deep learning, let me try and address your question.

  1. Deep learning is not appropriate for all problems. Small datasets might be better served by a fully Bayesian approach with a careful encoding of prior beliefs. Perhaps there are certain types of bandit problems one would be hard-pressed to use a deep architecture on or learn features for. Even in classification, classifying graphs (where graphs are the actual instances) poses challenges for deep learning and for many classification techniques throughout machine learning. Sometimes particular graphical models can be engineered for a problem. Sometimes other models are more computationally appropriate. Sometimes we can directly parameterize a discrete probability distribution and learn its parameters. Some of these situations are waiting for an enterprising researcher to apply the deep learning philosophy, but for others it simply seems inappropriate.

  2. All the things you mentioned can be part of deep models. A CRF can be deep or even be the top layer of a deep model. There are deep topic models (I am thinking of the replicated softmax with additional layers added). Learning the kernel in an SVM can sometimes result in a deep model. All of machine learning is related, and we as researchers can generalize across models. For example, SVMs taught us a very important lesson (often articulated by Yann LeCun) about training neural nets, namely that we shouldn't be afraid of using models with many parameters as long as we control overfitting some other way, with more powerful regularization. Also, sometimes training a neural net with the hinge loss is useful (see the sketch after this list). Machine learning is a giant interconnected web of ideas, and as the field continues, people keep finding new relationships between seemingly disparate concepts and models.
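
To make the hinge-loss point concrete, here is a minimal sketch (my own illustration, not from the answer) of a one-hidden-layer network trained with the hinge loss in plain NumPy, assuming labels in {-1, +1}; the data, sizes, and step size are placeholders.

    import numpy as np

    rng = np.random.RandomState(0)
    n, d, h = 200, 10, 32
    X = rng.randn(n, d)                    # placeholder inputs
    y = np.sign(rng.randn(n))              # placeholder labels in {-1, +1}

    W1 = 0.1 * rng.randn(d, h); b1 = np.zeros(h)
    w2 = 0.1 * rng.randn(h);    b2 = 0.0
    lr = 0.1

    for _ in range(200):
        H = np.tanh(X @ W1 + b1)           # hidden representation
        s = H @ w2 + b2                    # real-valued scores
        active = ((1.0 - y * s) > 0).astype(float)   # examples violating the margin
        # hinge loss mean(max(0, 1 - y*s)): gradient is nonzero only for active examples
        ds = -(active * y) / n
        w2 -= lr * (H.T @ ds); b2 -= lr * ds.sum()
        dH = np.outer(ds, w2) * (1.0 - H ** 2)       # backprop through tanh
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)

    print("training accuracy:", (np.sign(np.tanh(X @ W1 + b1) @ w2 + b2) == y).mean())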

To summarize, deep learning isn't the answer for every problem and all these other techniques have things to teach people, even people who are only interested in deep learning.

Although I agree with the general sentiment of rm9, I think it might be possible to make a relatively "out of the box" deep neural network system, especially with recent advances in hyper-parameter optimization. Just as with SVMs, the best results will come from experts, but reasonable results should be possible automatically with deep neural nets, as long as we give non-experts sufficiently sophisticated software packages and they have access to enough computation.
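
As a rough sketch of what that "automatic" workflow could look like, assuming scikit-learn's MLPClassifier and a random search over hyper-parameters (the search space and dataset here are illustrative assumptions, not tuned settings):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import RandomizedSearchCV
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)

    # sample network size, L2 penalty, and learning rate at random and keep the best by CV score
    search = RandomizedSearchCV(
        MLPClassifier(max_iter=300),
        param_distributions={
            "hidden_layer_sizes": [(64,), (128,), (128, 64), (256, 128)],
            "alpha": [1e-5, 1e-4, 1e-3, 1e-2],
            "learning_rate_init": [1e-3, 3e-3, 1e-2],
        },
        n_iter=10, cv=3, random_state=0,
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)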

answered Jul 19 '12 at 04:43


gdahl ♦

edited Jul 22 '12 at 02:06

In summary: http://www.no-free-lunch.org/

(Jul 19 '12 at 08:14) Dov

Because there is no free lunch?

answered Jul 26 '12 at 00:18


Lucian Sasu


http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization

(Jul 27 '12 at 14:20) Steven Hansen