|
I am trying to do named entity recognition (NER) and relation extraction on clinical text notes. Since this domain is specialized, I have a couple of questions.

For named entity recognition: I have read http://www.nltk.org/book/ch07.html. It says that for NER we can use a classifier function that has already been trained. But in medical notes, the nouns are all domain-specific terms that a classifier trained on general text will not recognize, so we have to train the classifier ourselves on a medical corpus. Do you know where I can find a well-chunked corpus suited to this goal?

For relation extraction: the book mentioned above only covers rule-based relation extraction. I want to learn about machine learning-based relation extraction. Do you know of other resources that introduce it? I know that ML needs a corpus whose relations have been annotated by hand first. What I don't know is what such an annotated relation corpus looks like. |
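
To make the last question concrete, here is my rough guess at what one hand-annotated example might contain, written as plain Python. The sentence, the entity types (`Drug`, `Problem`), the relation label (`treats`), the `none` label, and all field names are made up for illustration; they are not taken from any real corpus or annotation standard. The sketch stores entities as character spans and relations as pairs of entity ids, then pairs up the entities to produce labeled candidates of the kind I assume an ML relation classifier would be trained on.

```python
from itertools import combinations

# Hypothetical annotated record: every field name and label here is an assumption.
record = {
    "text": "Patient was given aspirin for chest pain but reported nausea.",
    "entities": [
        {"id": "T1", "type": "Drug", "start": 18, "end": 25},
        {"id": "T2", "type": "Problem", "start": 30, "end": 40},
        {"id": "T3", "type": "Problem", "start": 54, "end": 60},
    ],
    "relations": [
        {"type": "treats", "arg1": "T1", "arg2": "T2"},
    ],
}


def entity_text(record, entity):
    """Return the surface string covered by an entity's character span."""
    return record["text"][entity["start"]:entity["end"]]


def relation_candidates(record, negative_label="none"):
    """Pair each entity with every later entity and attach a label.

    Pairs annotated by hand get their relation type; all other pairs get
    the negative label. Only the (earlier, later) direction is considered.
    """
    gold = {(r["arg1"], r["arg2"]): r["type"] for r in record["relations"]}
    candidates = []
    for first, second in combinations(record["entities"], 2):
        label = gold.get((first["id"], second["id"]), negative_label)
        candidates.append(
            (entity_text(record, first), entity_text(record, second), label)
        )
    return candidates


for candidate in relation_candidates(record):
    print(candidate)
```

If real corpora look anything like this, I assume each candidate pair, together with features from the surrounding text, would become one training example. Is that roughly right?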