Is there any difference between the two learning methods, joint learning and multitask learning? If so, how do they differ? An explanation with a simple example would be very helpful.

asked Jan 05 at 17:16

Thetna

One Answer:

To understand the difference, it helps to look at the problem each one tries to solve.

In natural language processing it is common to want a full analysis of a sentence. This involves tagging words with parts of speech, identifying named entities, identifying noun phrases, parsing, labelling semantic roles, etc. Traditionally these tasks were attacked separately, and the algorithms for solving them were composed in a pipeline: first you POS-tag, then you do NER, then you do parsing, etc. Joint learning means doing all of these in one go. This is hard for several reasons: the labeled data for the different tasks are not the same, the kinds of structured models used for them do not always compose in a computationally efficient way, it is not always clear how to send information back and forth between tasks, and different tasks are "solved" to different degrees. What characterizes joint learning is that it always involves structured models; in a way it is more about inference than learning, and you are uniting two or more different structured models.
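
As a toy illustration of the inference side of this, here is a minimal sketch comparing a pipeline with joint decoding for a single ambiguous token. All the scores and label sets are made up for the example, and a real system would search over whole sentences with structured models rather than enumerate pairs of labels.

```python
from itertools import product

# Hypothetical scores for one ambiguous token, e.g. "Apple".
pos_score = {"NOUN": 1.0, "PROPN": 0.8}   # POS tagger's preferences
ner_score = {"O": 0.5, "ORG": 0.4}        # NER tagger's preferences
# Hypothetical compatibility between a POS tag and an NER label.
compat = {
    ("NOUN", "O"): 0.5,
    ("NOUN", "ORG"): -1.0,
    ("PROPN", "O"): -0.5,
    ("PROPN", "ORG"): 1.0,
}

def pipeline():
    """Commit to the best POS tag first, then pick the best NER label given it."""
    pos = max(pos_score, key=pos_score.get)
    ner = max(ner_score, key=lambda n: ner_score[n] + compat[(pos, n)])
    return pos, ner

def joint():
    """Pick the (POS, NER) pair with the highest total score."""
    return max(
        product(pos_score, ner_score),
        key=lambda pn: pos_score[pn[0]] + ner_score[pn[1]] + compat[pn],
    )

print(pipeline())  # ('NOUN', 'O')
print(joint())     # ('PROPN', 'ORG')
```

The pipeline locks in NOUN because it looks slightly better in isolation, and NER can no longer recover; the joint decoder sees that PROPN together with ORG scores higher overall. With full sentences and several tasks the joint search space grows quickly, which is the efficiency problem mentioned above.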

In machine learning more generally, you often have many related problems for which you want to build, say, classifiers. Suppose I run a mail server: if I want spam filtering, each user's filter is a different problem, but the problems share a lot of structure. Likewise for tagging reviews by sentiment: some linguistic cues transfer between, say, hotel rooms and cellphones ("good reception", for example, or "bad service"), but some don't ("small"). In these cases you have many different tasks and you want to share some information between them, so you are doing multi-task learning (MTL). MTL is about learning classifier parameters, and it really doesn't care what sits behind those parameters. Papers on MTL usually use the same kind of model for every task, and this model is often just binary or multiclass classification.
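
To make the spam example concrete, here is a minimal sketch of one simple way to share information between tasks: every user's weight vector is a shared part plus a user-specific correction, trained with plain perceptron updates. The users, words, and labels are invented for the example, and this is only one of many MTL formulations.

```python
from collections import defaultdict

# Hypothetical toy data: (user, words in the message, label); +1 = spam, -1 = not spam.
DATA = [
    ("alice", {"viagra"}, +1),
    ("alice", {"newsletter"}, +1),
    ("alice", {"meeting"}, -1),
    ("bob", {"viagra"}, +1),
    ("bob", {"newsletter"}, -1),
    ("bob", {"meeting"}, -1),
]

shared = defaultdict(float)                          # weights shared by all users
specific = defaultdict(lambda: defaultdict(float))   # per-user corrections

def score(user, words):
    """Effective weight of a word for a user = shared part + user-specific part."""
    return sum(shared[w] + specific[user][w] for w in words)

def is_spam(user, words):
    return score(user, words) > 0

def train(data, epochs=5):
    for _ in range(epochs):
        for user, words, label in data:
            if label * score(user, words) <= 0:      # perceptron mistake
                for w in words:
                    shared[w] += label               # update seen by every task
                    specific[user][w] += label       # update seen by this task only

train(DATA)

print(is_spam("alice", {"newsletter"}))  # True
print(is_spam("bob", {"newsletter"}))    # False
print(is_spam("carol", {"viagra"}))      # True: carol has no data, shared weights decide
```

The users disagree about "newsletter", so that signal ends up in the user-specific weights, while "viagra" is learned once in the shared weights and transfers even to a user with no training data.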

answered Jan 05 at 17:49

Alexandre Passos ♦


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.