|
Background: I am a software engineering intern (sophomore CS student at UC Berkeley) tasked with creating a module for a site that, given a user's Facebook information, recommends concerts, sporting events, and other live events the user is likely to be interested in. Problem details: We are pulling a Facebook user's profile information, including "likes", "interests", "activities", and "location". For those who don't use Facebook, here is an example of what the "likes" category looks like (all categories are JSON formatted). { "data": [
] } In addition, for each live event we have a list of keywords associated with it. For instance, a U2 concert could have the following keywords associated with it: <keywords>U2, U2 New Meadowlands Stadium, U2 East Rutherford,U2 Concert</keywords> My problem is that I don't know where to start making recommendations based on this information. The Facebook information is highly unstructured. Furthermore, there are many ways to make recommendations, such as basing them on a user's location, favorite teams, rival teams, etc. At this point, I don't have to use all the data; starting simple and building up to more complicated recommendation systems is fine, but I at least want a rigorous framework to build on. I can think of a couple of rudimentary approaches (i.e., simple word matching between events and likes), but I'd like to hear how experts in this field would approach the problem. I am looking for answers that: 1) recommend algorithms or procedures that can be used to solve this problem, and 2) suggest books/papers that discuss such problems. Please tell me if the problem statement is too unclear. Thanks a lot for the help. |
|
Start with baselines: recommend the most popular concerts and other events. As another starting point, recommend things that a user's friends like. Read some of the papers from the Netflix Prize challenge. You can easily apply many of those algorithms using edge distance (for example, in the friend graph) as a measure of user similarity. Trying those things should help initially. The keyword you want is "collaborative filtering". |
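To make the friend-based idea concrete, here is a minimal sketch in Python. It assumes you already have a friendship graph and a record of which events each user attended or liked; the names `graph`, `attended`, `edge_distances`, and `recommend`, and the toy data, are all made up for illustration. Each candidate event gets a score of 1/distance from every user who attended it, so events liked by close friends rank highest. Treat it as a starting baseline, not a tuned collaborative filter.

```python
from collections import deque, defaultdict


def edge_distances(graph, source, max_depth=3):
    """Breadth-first search: number of friendship edges from source to each user."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        if dist[user] >= max_depth:
            continue  # do not expand beyond max_depth hops
        for friend in graph.get(user, ()):
            if friend not in dist:
                dist[friend] = dist[user] + 1
                queue.append(friend)
    return dist


def recommend(graph, attended, user, top_n=3):
    """Score unseen events by summing 1/distance over the users who attended them."""
    seen = attended.get(user, set())
    scores = defaultdict(float)
    for other, d in edge_distances(graph, user).items():
        if other == user:
            continue
        for event in attended.get(other, ()):
            if event not in seen:
                scores[event] += 1.0 / d
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda e: (-scores[e], e))[:top_n]


# Hypothetical toy data: adjacency lists of friends and attended events.
graph = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob"],
}
attended = {
    "ann": set(),
    "bob": {"U2 Concert"},
    "cat": {"U2 Concert", "Giants Game"},
    "dan": {"Jazz Night"},
}

print(recommend(graph, attended, "ann"))
```

With this toy data, two direct friends attended the U2 concert, so it outranks the event attended by only one friend and the one attended by a friend of a friend.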
|
I have several tricks in mind. If your problem is the poor semantics of each like, you could use external knowledge sources such as Wikipedia or DBpedia to get a more precise type or category for any particular like. You could also measure the Google distance between the items in a user's likes and the event keywords, and rank events by the result. |
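For the second trick, a rough sketch of the Normalized Google Distance (Cilibrasi and Vitanyi) might look like the following. It assumes you have some way of obtaining hit counts for each term alone and for the two terms together; how you get those counts is left open, and the numbers below are invented purely for illustration. The names `ngd` and `rank_events` are hypothetical too.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts.

    fx, fy: hits for each term alone; fxy: hits for both terms together;
    n: (approximate) total number of indexed pages.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf  # terms that never co-occur are treated as unrelated
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def rank_events(like_count, events, n):
    """Sort event names by distance to one like, closest first.

    events maps an event name to (event_hits, co_occurrence_hits).
    """
    return sorted(
        events,
        key=lambda name: ngd(like_count, events[name][0], events[name][1], n),
    )


# Invented counts for one like compared against two event keywords.
events = {
    "Jazz Night": (5000, 10),
    "U2 Concert": (5000, 4000),
}
print(rank_events(8000, events, n=10**9))
```

A smaller distance means the like and the keyword tend to appear together, so the events at the front of the list are the better candidates. In practice you would probably combine the distances over all of a user's likes rather than ranking on a single one.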