I am currently working on a RandomForest-based prediction method using protein sequence data.
I have generated two models: the first (NF) uses a standard set of features, and the second (HF) uses hybrid features. I calculated the Matthews correlation coefficient (MCC) and accuracy for each, with the following results:

Model 1 (NF):
Training accuracy: 62.85%
Testing accuracy: 56.38%
MCC: 0.1673

Model 2 (HF):
Training accuracy: 60.34%
Testing accuracy: 61.78%
MCC: 0.1856

Since there is a trade-off in accuracy and MCC between the models, I am confused about their predictive power. Could you please share your thoughts on which model I should consider for further analysis?
Judging by the training and testing accuracy, it looks like Model 1 is overfitting. What is MCC?
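The train/test gap behind that overfitting remark can be checked directly from the accuracies reported in the question. This is only a small arithmetic sketch; the names `results` and `generalization_gap` are made up for illustration.

```python
# Accuracies (in percent) exactly as reported in the question.
results = {
    "Model 1 (NF)": {"train": 62.85, "test": 56.38},
    "Model 2 (HF)": {"train": 60.34, "test": 61.78},
}


def generalization_gap(train_acc, test_acc):
    """Training accuracy minus testing accuracy, in percentage points."""
    return train_acc - test_acc


# A large positive gap hints at overfitting; a gap near zero does not.
gaps = {
    name: generalization_gap(r["train"], r["test"])
    for name, r in results.items()
}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.2f} percentage points")
```

Model 1 loses several points going from training to testing, while Model 2 does not. Whether a difference of this size is meaningful probably depends on how large the test set is, which the question does not state.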
Frank, thanks for your suggestion. MCC = Matthews correlation coefficient.
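For anyone else unfamiliar with it, here is a minimal sketch of how MCC is usually computed for a binary classifier from the four confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the models in the question. Returning 0 when the denominator is zero is a common convention, assumed here.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: undefined cases are reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, for illustration only.
example = dict(tp=60, tn=55, fp=45, fn=40)
print(f"accuracy = {accuracy(**example):.4f}")
print(f"MCC      = {mcc(**example):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it is less easily inflated when one class dominates. That is one reason the two measures need not rank models the same way.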
I have cross-posted this question to Cross Validated: http://stats.stackexchange.com/questions/5093/statistical-validation-of-randomforest-models