I have a text classification problem where the classes are 20 cities and the input is bag-of-words text features. I am using logistic regression, and my cost function is the negative log-likelihood: the negative of the sum (or mean) over all samples of log(p_y_predicted).

I need to change my cost function so that it addresses my main objective, which is to minimize the total geographic distance between each sample's known coordinates and the coordinates of the city predicted for it. For each of the 20 cities (classes) we have a coordinate [latitude, longitude], and we have one for each piece of text in the training data too: train data = (text, [latitude, longitude], city).

I don't want to use regression. I want to predict the class, look up the coordinates of the predicted city, and count the distance from those coordinates to the known coordinates of the text as the cost. I know this is a bit difficult to grasp, but I'm writing in the hope that someone has seen the same problem or can see how to approach it.
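To make the objective concrete, here is a minimal sketch of the per-sample distances described above, assuming coordinates are in degrees and that great-circle (haversine) distance is an acceptable measure. The function names `haversine_km` and `sample_city_costs` and the Earth radius constant are illustrative choices, not something from my actual code.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption: a sphere is accurate enough)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees (broadcasts)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    # Clip guards against tiny floating-point overshoot outside [0, 1].
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def sample_city_costs(sample_coords, city_coords):
    """Return costs[i, k]: distance from sample i's known location to city k.

    sample_coords: (N, 2) array of [latitude, longitude]
    city_coords:   (K, 2) array of [latitude, longitude]
    """
    s = np.asarray(sample_coords, dtype=float)
    c = np.asarray(city_coords, dtype=float)
    return haversine_km(s[:, None, 0], s[:, None, 1], c[None, :, 0], c[None, :, 1])
```

With 20 cities this gives an (N, 20) array, and the quantity I want to minimize is the sum over samples of `costs[i, predicted_city[i]]`.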
I guess this problem is called a non-uniform cost function, or cost-sensitive classification?
I already have a misclassification cost matrix, so for each pair of classes I know the cost of misclassification. I just need to know the gradient.
I am trying to understand your setup. You are predicting one of 20 cities, correct? And you want to minimize the distance between the predicted city and the true city?
Yes, that's right.
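One possible way to get a gradient for this setup, offered as a sketch rather than a definitive answer: the distance of the argmax prediction is not differentiable, so a common surrogate is the expected cost under the softmax probabilities, loss_i = sum_k p[i, k] * costs[i, k]. Its gradient with respect to the logit z[i, j] is p[i, j] * (costs[i, j] - loss_i). The function name `expected_cost_and_grad` and the array shapes below are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_cost_and_grad(W, b, X, costs):
    """Mean expected misclassification cost and its gradients.

    X:     (N, D) bag-of-words features
    W:     (D, K) weights, b: (K,) biases
    costs: (N, K) cost of predicting class k for sample i
    """
    P = softmax(X @ W + b)                     # (N, K) class probabilities
    per_sample = (P * costs).sum(axis=1)       # expected cost of each sample
    loss = per_sample.mean()
    # d loss_i / d z[i, j] = p[i, j] * (costs[i, j] - loss_i), averaged over N
    dZ = P * (costs - per_sample[:, None]) / X.shape[0]
    grad_W = X.T @ dZ
    grad_b = dZ.sum(axis=0)
    return loss, grad_W, grad_b
```

With a class-pair cost matrix `C` of shape (K, K) and integer labels `y`, the per-sample costs are simply `costs = C[y]`. Note that this surrogate may not be convex in `W`, unlike the negative log-likelihood, so the result can depend on initialization.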