|
I'm using a naive Bayes classifier and each user's document history to decide which user a document should be assigned to. This currently works well for a small set of users who don't overlap much. Each user has their own two-class classifier, ['interesting', 'not interesting'], where 'interesting' is that user's documents and 'not interesting' is everyone else's documents. I'm trying to understand my options for scaling this system to thousands of users (many with similar preferences) and millions of documents. As I see it, I can use:
Are all of these feasible? Is there a clear best path? Can you recommend any reading material that deals with this? |
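
For reference, the per-user setup described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `NaiveBayes` class, the `build_user_classifiers` and `assign` helpers, the whitespace tokenization, the add-one smoothing, and the sample data are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, label, text):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_scores(self, text):
        """Return the unnormalized log posterior for each label."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def build_user_classifiers(history):
    """Train one classifier per user (hypothetical helper).

    history maps user -> list of document texts. Assumes at least two
    users, each with at least one document, so both labels are present.
    """
    classifiers = {}
    for user in history:
        clf = NaiveBayes()
        for owner, docs in history.items():
            label = "interesting" if owner == user else "not interesting"
            for doc in docs:
                clf.train(label, doc)
        classifiers[user] = clf
    return classifiers


def assign(classifiers, text):
    """Pick the user whose classifier most favors 'interesting' (hypothetical helper)."""
    def margin(clf):
        scores = clf.log_scores(text)
        return scores["interesting"] - scores["not interesting"]
    return max(classifiers, key=lambda user: margin(classifiers[user]))


# Made-up sample data, for illustration only.
history = {
    "alice": ["python bayes classifier tutorial", "machine learning with python"],
    "bob": ["sourdough bread recipe", "baking bread at home"],
}
classifiers = build_user_classifiers(history)
print(assign(classifiers, "python machine learning"))  # prints: alice
```

Note that in this sketch every user's classifier is trained on every document, so training cost grows with the number of users times the number of documents, which is the part that becomes a concern at thousands of users and millions of documents.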