I am interested in comparing ranked lists and quantifying how "similar" the rankings are. What are some good scores/measures to do that?

The rankings are produced by different runs of a system. On each run, it is given n instances and assigns a score to each, resulting in a ranking of the instances. But there is non-determinism, so the system comes up with slightly different results each time. I'm not so much interested in the actual scores that were assigned to the instances, but I want to evaluate the order that is produced on the different runs.

asked Sep 09 '10 at 07:51

Frank's gravatar image

Frank
1349274453


3 Answers:

If all positions in the rank are equally important the standard measure is Kendall's tau statistic, which counts the number of inversions between two permuations. This goes from 0 to O(n²). It can be defined as a metric between permutations. If you really want contiguous blocks of elements (but don't care if the blocks themselves are in the wrong order), you can try a variant of kendall's tau that only penalizes adjacent swaps. I'm not sure this is a metric, however.

If only the top k positions are of interest you can use precision@k (precision until it returns the top k ranked elements) or recall@k (among the top k returned elements, how many are atually among the top k elements). You can also compute all values of precision@k and recall@k and compute a precision-recall curve, and compute the area under that. If your elements are weighted you can use NDCG or ERR, possibly truncating at k. These can be used to evaluate an ordering as long as the true ordering is scored, but not the ordering being evaluated.

answered Sep 09 '10 at 09:49

Alexandre%20Passos's gravatar image

Alexandre Passos ♦
2554154278421

If you want a probability distribution of rankings given the correct one, Mallow's model is a pretty standard one. you can find his original paper on it here .

answered Sep 13 '10 at 14:26

Joseph%20Austerweil's gravatar image

Joseph Austerweil
331118

Further to Alex's comment, there is also Spearman's correlation and Gamma, which are non-parametric correlation metrics (Kendall's tau is one as well). A description can be found here. I use these to compare word lists (which are definitely not Gaussian, so Pearson's can't be used).

answered Sep 09 '10 at 21:30

Robert%20Layton's gravatar image

Robert Layton
1625122637

Your answer
toggle preview

powered by OSQA

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.