|
I am wondering whether there has been any research into incorporating our confidence in each training example into a machine learning model. Specifically, I have a set of training examples, and for each one I know how reliable its label is.
For example, there is a 95% probability that the label for X1 is "True" and a 5% chance that it is "False". Similarly, X2 is "True" with a probability of 70% and "False" with a probability of 30%. Finally, the probability of X3 having the label "True" is only 90%. Note that these are training data. I am training a random forest classification model on this data. Is there any trick that would let me use these confidences to train a better model? I looked for research papers but unfortunately could not find anything related to this problem. |
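
To make the setup concrete, here is a minimal sketch of one possible way to feed such confidences into a random forest. It is an assumption on my part, not an established answer: each example is duplicated once per possible label, and each copy is weighted by the probability of that label via the `sample_weight` argument of scikit-learn's `RandomForestClassifier.fit`. The feature values and the helper name `expand_soft_labels` are hypothetical; only the probabilities 0.95, 0.70, and 0.90 come from the example above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def expand_soft_labels(X, p_true):
    """Duplicate every row once per label and weight each copy by its label probability.

    Hypothetical helper: p_true[i] is the confidence that row i has label True (1).
    """
    X = np.asarray(X, dtype=float)
    p_true = np.asarray(p_true, dtype=float)
    n = len(X)
    X_rep = np.vstack([X, X])
    y_rep = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])
    w_rep = np.concatenate([p_true, 1.0 - p_true])
    keep = w_rep > 0  # drop copies that carry no weight
    return X_rep[keep], y_rep[keep], w_rep[keep]


# Made-up feature vectors for X1, X2, X3 (the question does not give any).
X = np.array([[0.2, 1.1], [0.9, 0.4], [0.5, 0.7]])
# Confidence that each label is "True", taken from the example in the question.
p_true = np.array([0.95, 0.70, 0.90])

X_rep, y_rep, w_rep = expand_soft_labels(X, p_true)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_rep, y_rep, sample_weight=w_rep)

proba = clf.predict_proba(X)  # columns follow clf.classes_, i.e. [0, 1]
```

With this weighting, a confidently labelled example such as X1 contributes almost entirely to the "True" class, while an uncertain one such as X2 pulls the trees toward both classes in a 70/30 ratio. Whether this actually improves accuracy over simply training on the most likely labels is something I have not verified.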