Given a large collection of text documents that contain terms in multiple languages, such as German and Chinese, are there good approaches to building a feature matrix? For single-language classification tasks, n-gram modeling works fine in most cases.
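To make the question concrete, here is a minimal sketch of one possible baseline: character n-grams hashed into a fixed number of buckets, which avoids language-specific word tokenization. It is only an illustration under assumptions, not a recommended solution. The function names, the n-gram range of 1 to 3, the bucket count of 1024, and the sample documents are all made up for this example.

```python
import zlib
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Yield all character n-grams of length n_min..n_max (assumed range)."""
    text = text.lower()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]


def hashed_features(text, n_buckets=1024):
    """Map one document to a sparse row {bucket_index: count}.

    CRC32 is used only because it is deterministic across runs;
    distinct n-grams may collide in the same bucket.
    """
    counts = Counter()
    for gram in char_ngrams(text):
        bucket = zlib.crc32(gram.encode("utf-8")) % n_buckets
        counts[bucket] += 1
    return dict(counts)


def feature_matrix(docs, n_buckets=1024):
    """Build a sparse feature matrix as a list of rows, one per document."""
    return [hashed_features(doc, n_buckets) for doc in docs]


# Hypothetical sample documents mixing German, Chinese, and English.
docs = [
    "Maschinelles Lernen ist spannend",
    "机器学习很有趣",
    "Machine learning 机器学习",
]
matrix = feature_matrix(docs)
```

Each row is a sparse mapping from bucket index to count, so it can be fed to any classifier that accepts sparse input. Character n-grams work on unsegmented Chinese text as well as on German compounds, at the cost of hash collisions and a loss of word-level meaning.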

asked Aug 01 '14 at 15:54

huaiyanggongzi


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.