If I wanted to extract product names from text, how would I get training data? For example, 'makeup' and 'Dolce & Gabbana' are both terms that could indicate a product someone might buy. Running a Google search on each term to see whether ads appear seems like a good approach, but Google would ban me as a bot before long. What other ways are there to determine whether a word has commercial significance?
asked Nov 30 '11 at 11:45