Can anybody offer (or point to) a simple (and ideally intuitive) explanation of Area under the ROC curve? asked Jul 11 '10 at 01:37 Anton Ballus John L Taylor 
Simply put, the area under the curve (AUC) of a receiver operating characteristic (ROC) curve is a way to reduce ROC performance to a single value representing expected performance. In a little more detail, a ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 − specificity) for a binary classifier as its discrimination threshold is varied. Since a random method traces the diagonal from (0, 0) to (1, 1), it has an AUC of 0.5. At a minimum, classifiers should perform better than this, and the larger the area under a classifier's ROC curve, the better its expected performance. For example, in the graph below you can see that in terms of AUC, VA (blue) outperforms NE (pink), which is quite a bit better than random (black). This is the best introduction I have read on the subject, and I urge you to read it to get a good sense of the use and abuse of ROC and AUC.
answered Jul 11 '10 at 02:45 John L Taylor
@John L Taylor, the link to the introduction is really really good
(Feb 08 '12 at 09:57)
VassMan

Peter Flach asks his students this in his ML exams, and shares the answer in his keynote talk at ECML 2007. At around minute 43 he gives several simple definitions of AUC as a performance measure for (bipartite) rankings. AUC is:
answered Jul 12 '10 at 06:15 Santi Villalba 
What's your definition of simple? :) I see the AUC as the probability P(X > Y), where X is a random variable corresponding to the distribution of outputs for the positive examples and Y is the one corresponding to the negative examples. Let's say a classifier outputs some score f(x) for a given example x. You've got N positive examples x_i, i=1...N, and M negative examples y_j, j=1...M. An unbiased estimator of the AUC is then sum_{i} sum_{j} 1_{f(x_i)>f(y_j)} / (N*M), where 1_{f(x_i)>f(y_j)} is 1 if f(x_i)>f(y_j) and zero otherwise.
answered Jul 11 '10 at 02:38 Dumitru Erhan
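The pairwise estimator in the answer above can be sketched in a few lines of Python. This is only an illustration: the function name `pairwise_auc` and the example scores are made up here, and the scores stand in for the classifier outputs f(x_i) and f(y_j).

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    pos_scores: scores f(x_i) for the N positive examples
    neg_scores: scores f(y_j) for the M negative examples
    """
    n, m = len(pos_scores), len(neg_scores)
    if n == 0 or m == 0:
        raise ValueError("need at least one positive and one negative example")
    # Indicator 1_{f(x_i) > f(y_j)} summed over all N*M pairs.
    wins = sum(1 for p in pos_scores for q in neg_scores if p > q)
    return wins / (n * m)


# Hypothetical scores, chosen only for illustration.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
print(pairwise_auc(pos, neg))  # 8 of the 9 pairs are ordered correctly
```

As written, a tie between a positive and a negative score counts as zero, exactly as in the formula; a common variant counts each tie as one half instead.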
Well, here is my attempt to explain the ideas behind ROC concept:
From the story I know, in WW II, after the Allies had invented radar, they needed a method to tell whether a point on the screen was some atmospheric disturbance or an enemy plane. ROC curves were used for this. A ROC curve tells you how the errors change as you change the separating value. Suppose you have two classes [plane or nothing] and the radar returns a value between 0 and 1. E.g., consider the following data. Let's start with 0.5 as the class boundary, i.e. a value > 0.5 is an aircraft while a value < 0.5 is just noise. As you can see from the data, with 0.5 there is some error, since two data points are misclassified as aircraft. Thus at 0.5 we note the false positive rate, i.e. the fraction of actually negative instances that were misclassified as positive (Y), and the true positive rate, i.e. the fraction of positive instances that were correctly classified (X). This gives us a pair (X, Y), which is a single point in the ROC graph.
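The thresholding step can be sketched in Python as below. The radar readings and labels are invented for illustration (they are not the answer's original data), and the helper names are hypothetical. Points are stored as (false positive rate, true positive rate), with the false positive rate on the horizontal axis; sweeping the threshold over every observed score and applying the trapezoidal rule then gives the area.

```python
def roc_point(scores, labels, threshold):
    """Return (FPR, TPR) when every score > threshold is called 'aircraft'."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos


def roc_points(scores, labels):
    """Sweep the threshold from high to low and collect one ROC point per value."""
    thresholds = sorted(set(scores), reverse=True)
    # One threshold below every score, so the curve ends at (1, 1).
    thresholds.append(min(scores) - 1.0)
    return [roc_point(scores, labels, t) for t in thresholds]


def area_under(points):
    """Trapezoidal area under points already ordered by increasing FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical radar readings: label 1 = aircraft, 0 = noise.
scores = [0.95, 0.85, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(roc_point(scores, labels, 0.5))          # two noise readings exceed 0.5
print(area_under(roc_points(scores, labels)))
```

With this made-up data, the threshold 0.5 yields the point (0.5, 0.75), and the full sweep gives an area of 0.8125.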
Also, X and Y are both rates, so they are bounded by [0, 1]. Now we repeat this process for all values of the boundary from 0 [minimum value] to 1 [maximum value]. This gives us (X, Y) pairs, which we then plot (X against Y, i.e. the true positive rate on the vertical axis against the false positive rate on the horizontal axis) to find the area under the curve. There are a few observations regarding AUC:
Note: I am marking this as community wiki, please correct any Grammatical/Factual Mistakes. For much deeper understanding
answered Jul 11 '10 at 14:50 DirectedGraph
+1 for the respect towards the word 'simple'.
(Mar 02 '14 at 02:29)
Arun Vijayan

Another resource I found useful for explaining ROC/AUC, coming from the bioinformatics community: P. Sonego, A. Kocsor, and S. Pongor. ROC analysis: applications to the classification of biological sequences and 3D structures. Briefings in Bioinformatics, 9(3):198–209, 2008.
answered Jul 12 '10 at 04:56 Georgiana Ifrim