I have machine learning code for Bag of Visual Words in Python that works well and produces meaningful results. I need to port it to C++. I wrote the C++ version, but I am not getting the expected results, and after 10 days of struggling with the code I still cannot find what is wrong. I will explain the code below; I may be doing something wrong, and maybe someone can come up with an idea:
- I am using the OpenCV and LibSVM libraries in my code
- The main algorithms I use are k-means clustering and SVM
Each step of my algorithm follows the standard BOW (bag of words) pipeline. First I extract visual features from the video frames and store them in a struct like the one below. Note that activities is an array of structs that hold the features of each video and the related information.
Mat features = allFeatures(Range::all(),Range(40,136));
activities[i].activityLabel = file_to_action(descriptorNameList.at(i));
activities[i].activityIntLabel = actionLabel_to_action_num(activities[i].activityLabel);
activities[i].frameSize = framesize;
activities[i].subjectID = file_to_subject(descriptorNameList.at(i));
features.copyTo(activities[i].Features);
Then I select the videos I use for training, collect their features into one big OpenCV Mat, and call the kmeans algorithm to cluster the rows of this matrix:
Mat TrainingDescriptors;
Inside a loop I push the training features into this matrix:
TrainingDescriptors.push_back(activities[i].Features);
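The histogram step further down depends on knowing which rows of the stacked matrix belong to which video (the frame_from and frame_to fields). A minimal sketch of that bookkeeping, using only the standard library; `RowRange` and `stackedRowRanges` are hypothetical names, and the ranges are half-open, `[from, to)`, which is also how `cv::Range` interprets its two arguments:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: half-open row range [from, to) of one video
// inside the stacked training matrix.
struct RowRange {
    int from;
    int to;
};

// Given the number of feature rows each video contributes, in the same order
// in which they are pushed back, return the rows each video occupies.
std::vector<RowRange> stackedRowRanges(const std::vector<int>& rowsPerVideo) {
    std::vector<RowRange> ranges;
    ranges.reserve(rowsPerVideo.size());
    int next = 0;  // first free row in the stacked matrix
    for (int rows : rowsPerVideo) {
        assert(rows >= 0);
        ranges.push_back({next, next + rows});
        next += rows;
    }
    return ranges;
}
```

If the ranges are computed in a different order than the push_back loop runs, the labels sliced for a video would belong to another video, so it may be worth printing both and comparing.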
And the kmeans part:
int dictionarySize = 4000;
Mat dictionary;
Mat labels;
Mat TrainingDescriptorsFloat;
TrainingDescriptors.convertTo(TrainingDescriptorsFloat,CV_32F);
kmeans(TrainingDescriptorsFloat,dictionarySize,labels,TermCriteria( CV_TERMCRIT_EPS|CV_TERMCRIT_ITER, 10000, 0.0001),5, KMEANS_PP_CENTERS, dictionary);
Now I have my visual dictionary and the resulting labels. Next I create a histogram from the labels for each of my videos. (Because my videos have different numbers of frames, I keep the number of frames for each video and use it when creating the histograms from the kmeans labels.) Inside a loop I create the histograms:
id = Orders.at<int>(i);
int from = activities[id].frame_from;
int to = activities[id].frame_to;
Mat labelsForHist = labels(Range(from,to),Range(0,1));
Mat labelsForHistFloat;
labelsForHist.convertTo(labelsForHistFloat, CV_32F);
calcHist(&labelsForHistFloat,1,channels,Mat(),hist,1,&histSize,&histRange,uniform,accumulate);
normalize(hist,hist,1,0,NORM_L1,-1,Mat());
hist.copyTo(activities[id].Histogram);
I convert the labels to float because calcHist apparently only accepts float input. I also normalize the histograms with OpenCV's normalize function and save them in each struct.
I thought that my problem was in creating the histograms, as I explained in an earlier post.
Now I need to calculate histograms for my testing videos and then pass the training and testing histograms to the SVM.
I create the histograms of the testing videos by comparing the features of each frame with the resulting dictionary (using the L2 distance). Inside a loop:
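Since the values of histSize and histRange are not shown here, one way to cross-check the calcHist output is to count the integer labels directly. The sketch below is an independent reference under that assumption, not the code used above; `labelHistogram` is a hypothetical name. Note that calcHist treats the upper bound of a range as exclusive, so for labels 0 to dictionarySize-1 the range would need to be {0, dictionarySize} with dictionarySize bins.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how often each visual word occurs in labels[from, to) and
// L1-normalise the counts so that they sum to 1.
std::vector<float> labelHistogram(const std::vector<int>& labels,
                                  std::size_t from, std::size_t to,
                                  int dictionarySize) {
    assert(dictionarySize > 0);
    assert(from <= to && to <= labels.size());
    std::vector<float> hist(static_cast<std::size_t>(dictionarySize), 0.0f);
    for (std::size_t i = from; i < to; ++i) {
        assert(labels[i] >= 0 && labels[i] < dictionarySize);
        hist[static_cast<std::size_t>(labels[i])] += 1.0f;
    }
    const float total = static_cast<float>(to - from);
    if (total > 0.0f) {
        for (float& h : hist) h /= total;
    }
    return hist;
}
```

Comparing a few of these histograms bin by bin with the calcHist result (and with the Python ones) should show whether the binning is where the two versions diverge.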
for(int j= 0;j < sz;j++)
{
    float score;
    float minscore = 1000;
    float lbl;
    for(int k = 0; k< dictionarySize;k++)
    {
        score = norm(features.at<float>(j),dictionary.at<float>(k),NORM_L2,Mat());
        if(score < minscore)
        {
            minscore =score;
            lbl = k;
        }
    }
    lbels.push_back(lbl);
}
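One thing that may be worth checking in the loop above: `features.at<float>(j)` and `dictionary.at<float>(k)` each appear to read a single float element rather than a whole descriptor row, and a starting minimum of 1000 leaves lbl unset whenever every distance is larger than that. The sketch below shows the row-wise nearest-word assignment the text describes, with plain vectors standing in for the Mat rows; `Row`, `squaredL2` and `nearestWord` are hypothetical names, and this is an illustration rather than the code that produced the results.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Row = std::vector<float>;  // one descriptor, or one dictionary word

// Squared Euclidean distance between two descriptors of equal length.
// The square root is not needed to find the minimum.
float squaredL2(const Row& a, const Row& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const float diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Index of the dictionary word closest to the descriptor.
int nearestWord(const Row& descriptor, const std::vector<Row>& dictionary) {
    assert(!dictionary.empty());
    int best = 0;
    float bestDist = std::numeric_limits<float>::max();  // no fixed cap
    for (std::size_t k = 0; k < dictionary.size(); ++k) {
        const float dist = squaredL2(descriptor, dictionary[k]);
        if (dist < bestDist) {
            bestDist = dist;
            best = static_cast<int>(k);
        }
    }
    return best;
}
```

With OpenCV matrices the equivalent comparison would be between `features.row(j)` and `dictionary.row(k)`.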
Then I calculate the histograms inside another loop:
id = testingOrders.at<int>(i);
int from = activities[id].frame_from;
int to = activities[id].frame_to;
Mat labelsForHist = lbels(Range(from,to),Range(0,1));
Mat labelsForHistFloat;
labelsForHist.convertTo(labelsForHistFloat, CV_32F);
calcHist(&labelsForHistFloat,1,channels,Mat(),hist,1,&histSize,&histRange,uniform,accumulate);
normalize(hist,hist,1,0,NORM_L1,-1,Mat());
hist.copyTo(activities[id].Histogram);
Now I put the histograms into the format described in the LibSVM documentation and set my parameters. These are the parameters for my SVM. In some posts people say the parameters must be tuned to get proper results, but I have already tried several values.
param.svm_type = C_SVC;// NU_SVC
param.kernel_type = RBF;// //LINEAR // POLY //SIGMOID
param.degree = 3;
param.gamma = 1; // 1/num_features /0,0.5,1,10
param.coef0 = 0;
param.nu = 0.5;
param.cache_size = 100;
param.C = 10; //1,10,1000
param.eps = 1e-3;
param.p = 0.1;
param.shrinking = 1;
param.probability = 0;
param.nr_weight = 0;
param.weight_label = NULL;
param.weight = NULL;
cross_validation = 0;
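For the tuning mentioned above, the LibSVM practical guide suggests searching C and gamma over exponentially growing sequences (C = 2^-5, 2^-3, ..., 2^15 and gamma = 2^-15, 2^-13, ..., 2^3) rather than a handful of hand-picked values. A minimal sketch that only generates such a grid; `powersOfTwo` is a hypothetical helper, and plugging each pair into param.C and param.gamma is left to the surrounding code:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Powers of two 2^first, 2^(first+step), ..., up to and including 2^last
// when last is reachable from first in whole steps.
std::vector<double> powersOfTwo(int first, int last, int step) {
    assert(step > 0 && first <= last);
    std::vector<double> values;
    for (int e = first; e <= last; e += step) {
        values.push_back(std::ldexp(1.0, e));  // exactly 1.0 * 2^e
    }
    return values;
}
```

Each (C, gamma) pair from `powersOfTwo(-5, 15, 2)` and `powersOfTwo(-15, 3, 2)` could then be scored on the training histograms, for example with LibSVM's svm_cross_validation.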
I train a support vector machine with the training histograms in prob and the parameters in param, and keep the trained model:
model = svm_train(&prob,&param);
With the trained model, this is how I test the testing videos:
if(allActivities[i].order == testingOrders.at<int>(j))
{
    int a = elements.at(j);
    //x_space_test = Malloc(struct svm_node,a+1);
    x_space_test = (struct svm_node *) realloc(x_space_test,(a+1)*sizeof(struct svm_node));
    Mat hist;
    allActivities[i].Histogram.copyTo(hist);
    cv::Size histsize = hist.size();
    int hs = histsize.height;
    int l = 0;
    for(int k=0; k < hs; k++)
    {
        if(hist.at<float>(k,0) != 0)
        {
            x_space_test[l].value = hist.at<float>(k,0);
            x_space_test[l].index = k+1;
            cout<<x_space_test[l].value<<" "<<x_space_test[l].index<<endl;
            l++;
        }
    }
    x_space_test[l].value = 0;
    x_space_test[l].index = -1;
    cout<<x_space_test[l].value<<" "<<x_space_test[l].index<<endl;
    target_label = (double)allActivities[i].activityIntLabel;
    cout<<"target label: "<<target_label;
    if (predict_probability && (svm_type==C_SVC || svm_type==NU_SVC))
    {
        predict_label = svm_predict_probability(model,x_space_test,prob_estimates);
        fprintf(output,"%g",predict_label);
        for(j=0;j<nr_class;j++)
            fprintf(output," %g",prob_estimates[j]);
        fprintf(output,"\n");
    }
    else
    {
        predict_label = svm_predict(model,x_space_test);
        fprintf(output,"%g\n",predict_label);
    }
    cout<<" predict label: "<<predict_label<<endl;
    if(predict_label == target_label)
        ++correct;
    error += (predict_label-target_label)*(predict_label-target_label);
    sump += predict_label;
    sumt += target_label;
    sumpp += predict_label*predict_label;
    sumtt += target_label*target_label;
    sumpt += predict_label*target_label;
    ++total;
    break;
}
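The histogram-to-node conversion in the block above can be isolated so that the training and testing vectors are guaranteed to be built the same way (1-based indices, zero bins skipped, terminator with index -1). A minimal sketch under that assumption; `Node` is a stand-in with the same two fields as LibSVM's svm_node so the example compiles without svm.h, and `toSparseNodes` is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for LibSVM's svm_node (same fields: int index, double value).
struct Node {
    int index;
    double value;
};

// Convert a dense histogram into LibSVM's sparse layout: one node per
// non-zero bin with a 1-based index, followed by a terminator (index -1).
std::vector<Node> toSparseNodes(const std::vector<float>& hist) {
    std::vector<Node> nodes;
    for (std::size_t k = 0; k < hist.size(); ++k) {
        if (hist[k] != 0.0f) {
            nodes.push_back({static_cast<int>(k) + 1,
                             static_cast<double>(hist[k])});
        }
    }
    nodes.push_back({-1, 0.0});  // end-of-vector marker
    return nodes;
}
```

Because the vector grows as needed, there is no separate non-zero count (elements.at(j) above) that has to match the histogram; a too-small count there would make the realloc'd buffer overflow.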
The problem is that I always get the same label for all videos, e.g. the label is always 2. If I change the parameters I sometimes get other labels, but they are all wrong: I get 0% accuracy in C++, while I got 98% in Python.
1) I suspected that the histograms (OpenCV's calcHist function) are not computed properly, as I explained in my other post. I also created them in MATLAB, and they were the same as the C++ ones but different from Python's.
2) Now I think there might be something wrong in the kmeans step, and that it does not work as well as scikit-learn's MiniBatchKMeans in Python.
3) My last guess is that the problem is in the SVM part of the code and that I am doing something wrong there.
Sorry for the long post, but maybe somebody has ideas that can help me out of this frustration.