Hi, I created a Lucene-based search system, and now I wonder how to test it. The system searches through ads to find the best one. The difference from common search engines is that I need only the top few (maybe 1-3) most relevant documents. Should I use standard metrics like mean average precision (MAP), or do I need something completely different? Thanks!
You can use precision@K and/or NDCG@K, or even MAP@K, for small values of K. Setting K to 1 is not always a good idea, since every metric can then easily come out as zero. It also depends on what kind of labels you have (for all you know there could be more than one right ad, or each ad could have a hidden "goodness" score, or something like that).

Thanks Alexandre. I was thinking about MAP@K. I agree that setting K=1 isn't a good approach. I guess I have no other scores of 'goodness' besides the one given by the engine. By the way, can you give some links to a description of precision@K? I'm quite new to IR. I would be very obliged.
(Aug 17 '11 at 10:53)
Konstantin
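For readers who, like the asker, are new to IR: here is a minimal Python sketch of MAP@K and NDCG@K as they are commonly defined in textbooks. It is not taken from the thread. The function names, the choice to normalise AP@K by the number of hits in the top K, and the base-2 log discount are all assumptions of this sketch; other conventions exist.

```python
import math

def average_precision_at_k(labels, k):
    """AP@K for one query; labels are 0/1 relevance flags in ranked order."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(labels[:k], start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this rank
    # Assumption: normalise by the number of relevant items found in the top K.
    return total / hits if hits else 0.0

def mean_average_precision_at_k(labels_per_query, k):
    """MAP@K: the mean of AP@K over all queries."""
    scores = [average_precision_at_k(labels, k) for labels in labels_per_query]
    return sum(scores) / len(scores) if scores else 0.0

def ndcg_at_k(gains, k):
    """NDCG@K; gains are graded (or 0/1) relevance values in ranked order."""
    def dcg(values):
        # Assumption: log2 discount, so rank 1 is divided by log2(2) = 1.
        return sum(g / math.log2(i + 1) for i, g in enumerate(values, start=1))
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0
```

With only 1-3 results per query of interest, K=3 would be a natural choice, for example `mean_average_precision_at_k(all_query_labels, 3)`, where `all_query_labels` is a hypothetical list of per-query label lists.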
Konstantin: recall@K is how many of the top K returned items are right. For precision@K, let L be the number of items you go through until you have seen K true items; precision@K is then K/L. For example, suppose we want precision@3 and the labels of the ranked list are 0 0 1 0 1 0 1 0 1 1 1 1. We go down the list until we have found 3 true items (0 0 1 0 1 0 1), count how many items we have seen (7), and get 3/7. You can think of it as precision when recall is fixed at K items. Precision@1 is then the inverse of the position of the first true item.
(Aug 17 '11 at 11:12)
Alexandre Passos ♦
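The worked example in the comment above can be checked with a short Python sketch. The function names are hypothetical, and returning 0 when fewer than K true items exist is an assumption of this sketch. Note that many references define precision@K differently, as the fraction of the top K returned items that are right, so that variant is included for comparison.

```python
def precision_at_k_found(labels, k):
    """Definition from the comment: K / L, where L is the number of items
    seen until K true items have been found."""
    found = 0
    for position, label in enumerate(labels, start=1):
        if label:
            found += 1
            if found == k:
                return k / position
    return 0.0  # assumption: fewer than K true items in the list

def precision_in_top_k(labels, k):
    """Common alternative definition: fraction of the top K items that are true."""
    return sum(labels[:k]) / k

labels = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1]
p3 = precision_at_k_found(labels, 3)   # 3 true items found after 7 items: 3/7
p1 = precision_at_k_found(labels, 1)   # first true item is at position 3: 1/3
top3 = precision_in_top_k(labels, 3)   # 1 true item in the top 3: 1/3
```

The two definitions can disagree on the same list, so it is worth stating which one is used when reporting results.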
Ah, got it, thanks.
(Aug 18 '11 at 03:49)
Konstantin