Hi,

I built a Lucene-based search system, and now I wonder how to test it.

The system searches through ads to find the best one. So the difference from common search engines is that I need only the best (maybe 1-3) most relevant documents. Should I use standard metrics like mean average precision (MAP), or do I need something completely different?

Thanks!

asked Aug 17 '11 at 05:51

Konstantin


One Answer:

You can use precision@K and/or NDCG@K, or even MAP@K, for small values of K. Setting K to 1 is not always a good idea, since all of these metrics can then easily come out as zero. It also depends on what kind of labels you have: for all you know, there could be more than one right ad, or each ad could have a hidden "goodness" score, or something like that.
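
As a rough illustration of these metrics, here is a minimal Python sketch. It uses the common textbook definitions, which are an assumption here and not something the thread spells out: precision@K is the fraction of the top K results that are relevant, NDCG@K uses a log2 position discount with the ideal ordering taken from the returned list only, and AP@K is normalized by the number of relevant items found in the top K (other normalizations exist). All function names are hypothetical.

```python
import math

def precision_at_k(labels, k):
    """Fraction of the top-k results that are relevant (labels > 0)."""
    return sum(1 for x in labels[:k] if x > 0) / k

def dcg_at_k(gains, k):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k):
    """DCG@k divided by the DCG@k of the best reordering of the same list."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

def average_precision_at_k(labels, k):
    """Mean of precision@i over the relevant positions i <= k.
    Assumption: normalized by the number of hits inside the top k."""
    hits, total = 0, 0.0
    for i, x in enumerate(labels[:k]):
        if x > 0:
            hits += 1
            total += hits / (i + 1)
    return total / hits if hits else 0.0

def map_at_k(label_lists, k):
    """MAP@k: average of AP@k over all queries."""
    return sum(average_precision_at_k(l, k) for l in label_lists) / len(label_lists)

# One query whose best ad is ranked second: every metric at K=1 is zero,
# while K=3 still gives partial credit.
labels = [0, 1, 1]
print(precision_at_k(labels, 1), precision_at_k(labels, 3))
print(ndcg_at_k(labels, 1), ndcg_at_k(labels, 3))
print(average_precision_at_k(labels, 1), average_precision_at_k(labels, 3))
```

With graded "goodness" labels, `ndcg_at_k` can take the grades directly as gains, whereas the two precision-based functions only distinguish relevant from non-relevant.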

answered Aug 17 '11 at 07:53

Alexandre Passos ♦

Thanks, Alexandre. I was thinking about MAP@k. I agree that setting k=1 isn't a good approach.

I guess I have no other 'goodness' scores besides the one given by the engine.

By the way, can you give me some links to a description of precision@K? I'm quite new to IR and would be much obliged.

(Aug 17 '11 at 10:53) Konstantin

Konstantin: recall@K is, of the top K returned items, how many are right. Precision@K is K/L, where L is the number of items you have to look at before you have seen K true ones. For example, suppose we want precision@3 and the labels of the item list are 0 0 1 0 1 0 1 0 1 1 1 1. We go down the list until we have found 3 true items (0 0 1 0 1 0 1), count how many items we have seen (7), and get 3/7. You can think of it as precision when recall is fixed at K items. Precision@1 is then the inverse of the position of the best (first true) item.
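
The worked example above can be checked with a short Python sketch. It follows the definitions exactly as given in this comment; note that many references instead define precision@K as the fraction of the top K results that are relevant. The function names are hypothetical, labels are assumed to be binary (0 or 1), and returning 0.0 when the list holds fewer than K true items is an assumption.

```python
def hits_in_top_k(labels, k):
    """Number of true items among the top k (recall@K as described above)."""
    return sum(labels[:k])

def precision_at_k_hits(labels, k):
    """K / L, where L is the number of items scanned until the k-th true
    item is found (precision@K as described above)."""
    hits = 0
    for position, label in enumerate(labels, start=1):
        hits += label
        if hits == k:
            return k / position
    return 0.0  # assumption: fewer than k true items in the list

labels = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1]
print(precision_at_k_hits(labels, 3))  # 3/7: third true item is at position 7
print(precision_at_k_hits(labels, 1))  # 1/3: first true item is at position 3
print(hits_in_top_k(labels, 3))        # 1: one true item in the top 3
```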

(Aug 17 '11 at 11:12) Alexandre Passos ♦

Ah, got it, thanks.

(Aug 18 '11 at 03:49) Konstantin

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.