Is it better to start with a blank mind when building a Natural Language Processing application, and read up only as you run into problems after trying to solve them yourself, or should you read enough before you kick-start your work on the application?

asked Jul 14 '10 at 11:22

ArchieIndian

edited Jul 15 '10 at 10:58

Joseph Turian ♦♦

Wasn't this question a Fiona Apple album?

(Jul 14 '10 at 11:48) Richie Cotton

I asked this question out of a little bit of experience. We are a bunch of guys who took a course in NLP. We started a project with some concepts in mind and got almost nowhere. We then set all of those concepts aside and tried using common sense instead, and we are now on a track that seems right.

(Jul 15 '10 at 04:39) ArchieIndian

The question edit completely ruins Richie Cotton's Fiona Apple joke. It's certainly better for the site overall, but it was a good joke :/

(Jul 15 '10 at 11:26) Andrew Rosenberg

7 Answers:

You will find that your intuitions are likely to be misleading when working on natural language processing tasks. Human beings have both built-in neural hardware for language processing and huge amounts of experience with the world and with using language. Your software will have none of that.

So, read as much as you can about previous approaches to solving the problem you care about. Look for repeating themes across a wide range of work, since an individual researcher will often be incorrect about what the actual reasons for the success or failure of a particular technique were.

answered Aug 21 '11 at 18:43

Dave Lewis

Think about the problem, ask questions, try to find answers with the help of papers, discussions with colleagues, etc. Then start implementing and experimenting.

answered Jul 16 '10 at 01:54

Frank

My approach to learning new things:

  • Formulate the problem
  • Try to find a solution without any special prior knowledge (do not code at this stage!)
  • Read papers/learn/reproduce some results, maybe
  • Try to solve/write code
  • Read papers/learn/...
  • Try to solve again/write code
  • .....

This is an iterative process, and it converges :)

answered Jul 15 '10 at 03:51

bijey

Just as a complement to Alexandre Passos's excellent list: if you do start w/o reading anything, you'll have a better appreciation for what you do read. E.g. if you've banged your head trying to tackle a problem and then read an article or chapter on how the current approach to that problem works, you'll appreciate it more. You'll understand some of the issues, because you met them before you were exposed to the theory and the general approaches.

answered Jul 14 '10 at 23:29

Sean McKay

Surely it depends on how much experience you have with natural language processing. If you already have a lot, it can be extremely instructive to try and fail with the tools you know, because when you finally read up and learn a new one, you know exactly why you need it, and why (in practical terms) it works better than anything you knew already.

If you don't have a lot of experience, you risk stumbling upon a correct or high-test-accuracy answer by accident or because of a bug, and not really learning anything.

answered Jul 14 '10 at 18:29

aditi

Both could work, and both induce different biases.

Not reading anything makes it easier to (among other things):

  • Formulate the problem in a subtly wrong way that makes no sense linguistically or leads you towards bad ways of solving it
  • Not use standard evaluation metrics/not optimize them
  • Ignore easy solutions to possible subproblems (for example, trying to resolve coreference without doing named entity recognition or part of speech tagging or chunking)
  • Fall into tempting dead ends/think a standard baseline is a very cool new technique
  • Think a special case of the problem you find interesting is interesting to the general community while it is trivial/uninteresting/loosely formulated/etc
  • Think that a good design decision that most techniques make is bad, while an early paper actually showed the alternative was a lot worse.

(I've made all of these mistakes, btw)

Reading everything you can get your hands on, on the other hand, might lead you to fail to see different approaches to solving the problem that can substantially improve upon existing techniques.

I guess, overall, it's good to have an acquaintance with previous work but also keep in mind that it does not necessarily cover the whole solution space (after all, for example, interesting linear binary classifiers, such as confidence-weighted algorithms or budgeted stochastic SVMs, are still being developed).
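
To make the points about standard evaluation metrics and standard baselines a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the labels, the toy data and the function names are made up, and it is not taken from any particular toolkit. It compares a hypothetical tagger against a majority-class baseline, using both accuracy and precision/recall/F1 on the rare class.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of positions where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision_recall_f1(gold, predicted, positive):
    """Precision, recall and F1 for a single label treated as the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def majority_baseline(train_labels, n):
    """Always predict the most frequent training label (a standard baseline)."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n


# Made-up toy data: "PER" marks a person name, "O" everything else.
train_labels = ["O", "O", "O", "O", "O", "O", "PER", "PER"]
gold = ["O", "O", "O", "O", "O", "O", "O", "O", "PER", "PER"]
system = ["O", "O", "O", "O", "O", "O", "O", "PER", "PER", "O"]  # hypothetical tagger output
baseline = majority_baseline(train_labels, len(gold))

print("accuracy  system:", accuracy(gold, system), "baseline:", accuracy(gold, baseline))
print("PER P/R/F system:", precision_recall_f1(gold, system, "PER"))
print("PER P/R/F baseline:", precision_recall_f1(gold, baseline, "PER"))
```

On this toy data both outputs reach the same accuracy, while only the hypothetical tagger ever finds a person name. Checking the metric the community reports for the task, against a trivial baseline, is one way to catch that kind of gap early.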

answered Jul 14 '10 at 12:12

Alexandre Passos ♦

Think first, program later. Great advice, seldom listened to.

answered Jul 14 '10 at 11:49

Richie Cotton

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.