I am doing some classification tasks and I have 8 categories. The test results below come from a very popular implementation that I don't think has bugs:

Precision: 0.5309078104890105
Recall: 0.6637245030504184
Accuracy: 0.8354180381830152
F1 Score: 0.5899330173622899

I know the formulas for precision, recall, F1 and accuracy, but I'm curious how to interpret these results. F1 is very low compared to my other experiments, while accuracy is very high. My classes are skewed: about 50 percent of documents belong to one category.

I have done these calculations:

precision = tp / (tp + fp) = 0.53  =>  tp ≈ fp
recall = tp / (tp + fn) = 0.66  =>  0.66 tp + 0.66 fn = tp  =>  0.34 tp = 0.66 fn  =>  tp ≈ 2 fn
accuracy = (tp + tn) / (tp + tn + fp + fn) = 0.83

Substituting fp = tp and fn = 0.5 tp:

(tp + tn) / (2.5 tp + tn) = 0.83  =>  0.17 tn = 1.075 tp  =>  tp ≈ 0.16 tn

Is there any good explanation for this?
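To sanity-check that algebra numerically, here is a minimal Python sketch of my own (not from the implementation): it fixes tp = 1 and derives the other counts from the reported metrics. Treating the four numbers as binary-style tp/fp/fn/tn counts is a simplifying assumption for an 8-class problem.

```python
# Sanity check: set tp = 1 and derive fp, fn, tn from the reported metrics.
precision = 0.5309
recall = 0.6637
accuracy = 0.8354

tp = 1.0
fp = tp * (1 - precision) / precision   # from precision = tp / (tp + fp)
fn = tp * (1 - recall) / recall         # from recall = tp / (tp + fn)

# Solve accuracy = (tp + tn) / (tp + tn + fp + fn) for tn:
# tn * (1 - accuracy) = accuracy * (tp + fp + fn) - tp
tn = (accuracy * (tp + fp + fn) - tp) / (1 - accuracy)

print(f"fp = {fp:.2f} tp, fn = {fn:.2f} tp, tn = {tn:.2f} tp")
# prints: fp = 0.88 tp, fn = 0.51 tp, tn = 6.06 tp
# i.e. tp is roughly 0.16 tn, consistent with the derivation above
```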
Is there any way you can plot a ROC curve? That can be illuminating. I find it easier than wading through the numbers.
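For reference, a minimal one-vs-rest sketch with scikit-learn and matplotlib could look like the following. The names y_true and y_score, and the random placeholder data, are assumptions standing in for your actual labels and per-class scores (e.g., from your classifier's predict_proba):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

n_classes = 8  # assumption: matches your 8 categories

# y_true: integer class labels, shape (n_samples,)
# y_score: per-class scores, shape (n_samples, n_classes)
y_true = np.random.randint(0, n_classes, size=500)  # placeholder data
y_score = np.random.rand(500, n_classes)            # placeholder data

# Binarize the labels so each class gets its own one-vs-rest ROC curve
y_bin = label_binarize(y_true, classes=np.arange(n_classes))

plt.figure()
for i in range(n_classes):
    fpr, tpr, _ = roc_curve(y_bin[:, i], y_score[:, i])
    plt.plot(fpr, tpr, label=f"class {i} (AUC = {auc(fpr, tpr):.2f})")

plt.plot([0, 1], [0, 1], "k--", label="chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```

With skewed classes, the curve (and AUC) for the majority class versus the minority classes should make it obvious where the high accuracy but low F1 is coming from.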