Background: I am a software engineering intern (a sophomore CS student at UC Berkeley) tasked with building a module for a site that, given a user's Facebook information, recommends concerts, sporting events, and other live events the user is likely to be interested in.

Problem details: We pull a Facebook user's profile information, including "likes", "interests", "activities", and "location". For those who don't use Facebook, here is an example of what the "likes" category looks like (all of the categories are JSON formatted).

{ "data": [

  {
     "name": "The Green Mile",
     "category": "Movie",
     "id": "212526758758843",
     "created_time": "2011-05-17T23:58:57+0000"
  },
  {
     "name": "No Country For Old Men",
     "category": "Movie",
     "id": "171450412909941",
     "created_time": "2011-05-05T22:50:50+0000"
  },
  {
     "name": "Iron Maiden",
     "category": "Musician/band",
     "id": "172685102050",
     "created_time": "2011-04-03T23:17:44+0000"
  },
  {
     "name": "Black Sabbath",
     "category": "Musician/band",
     "id": "56848544614",
     "created_time": "2011-03-24T07:47:03+0000"
  },
  {
     "name": "Taxi Driver",
     "category": "Movie",
     "id": "129628777110156",
     "created_time": "2011-03-09T09:01:17+0000"
  },
  {
     "name": "Internet Problem Solving Contest (IPSC)",
     "category": "Website",
     "id": "123586307655716",
     "created_time": "2010-06-02T02:19:37+0000"
  },
  {
     "name": "\u0645\u0646 \u0639\u0627\u0634\u0642 \u0686\u0644\u0648 \u06a9\u0628\u0627\u0628 \u0647\u0633\u062a\u0645",
     "category": "Community",
     "id": "105913032784838",
     "created_time": "2010-04-28T03:55:24+0000"
  },
  {
     "name": "O Brother, Where Art Thou?",
     "category": "Movie",
     "id": "111994392146832",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "The Godfather trilogy",
     "category": "Movie",
     "id": "114177228593871",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "The Dollars Trilogy",
     "category": "Movie general",
     "id": "357757340492",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "Techno",
     "category": "Musical genre",
     "id": "105708646128371",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "Speed metal",
     "category": "Musical genre",
     "id": "107945175892500",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "Donnie Brasco",
     "category": "Movie",
     "id": "103096339730254",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "Doctor Strangelove How I Learned to Love the Bomb",
     "category": "Movie general",
     "id": "392294291146",
     "created_time": "2010-04-28T03:42:39+0000"
  },
  {
     "name": "Everything is funnier when you're with your bestfriend.",
     "category": "Non-profit organization",
     "id": "227454654627",
     "created_time": "2010-01-11T21:04:11+0000"
  },
  {
     "name": "Facebook Engineering",
     "category": "Computers/technology",
     "id": "9445547199",
     "created_time": "2009-12-09T21:17:14+0000"
  },
  {
     "name": "Summer Break!",
     "category": "Club",
     "id": "85386132170",
     "created_time": "2009-07-04T03:26:17+0000"
  },
  {
     "name": "Bruce Lee",
     "category": "Public figure",
     "id": "184049470633",
     "created_time": "2009-07-03T01:02:01+0000"
  }

] }
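
As a starting point for working with this data, here is a minimal Python sketch that parses a likes response and groups the names by category. The helper name `group_likes_by_category` is made up for illustration, and `raw` is a shortened copy of the sample above; nothing here calls the real Facebook API.

```python
import json
from collections import defaultdict

# A shortened copy of the sample "likes" response shown above.
raw = """
{ "data": [
  {"name": "Iron Maiden", "category": "Musician/band",
   "id": "172685102050", "created_time": "2011-04-03T23:17:44+0000"},
  {"name": "Black Sabbath", "category": "Musician/band",
   "id": "56848544614", "created_time": "2011-03-24T07:47:03+0000"},
  {"name": "Taxi Driver", "category": "Movie",
   "id": "129628777110156", "created_time": "2011-03-09T09:01:17+0000"},
  {"name": "Speed metal", "category": "Musical genre",
   "id": "107945175892500", "created_time": "2010-04-28T03:42:39+0000"}
] }
"""

def group_likes_by_category(response_text):
    """Return {category: [names]} for a likes response (hypothetical helper)."""
    grouped = defaultdict(list)
    for like in json.loads(response_text)["data"]:
        grouped[like["category"]].append(like["name"])
    return dict(grouped)

likes_by_category = group_likes_by_category(raw)
print(likes_by_category["Musician/band"])
```

Grouping by category makes it easy to start with only the categories that map naturally onto live events, such as "Musician/band" and "Musical genre", and to ignore the rest at first.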

In addition, for each live event we have a list of keywords associated with it. For instance, a U2 concert could have the following keywords associated with it: <keywords>U2, U2 New Meadowlands Stadium, U2 East Rutherford, U2 Concert</keywords>

My problem is that I don't know where to start making recommendations based on this information. The Facebook information is highly unstructured. Furthermore, there are many ways of going about making recommendations, such as making recommendations based on a user's location or favorite teams or rival teams, etc.

At this point, I don't have to use all the data to make recommendations; starting simple and building up to more complicated recommendation systems is fine, but I at least want a rigorous framework to build on. I can think of a couple of rudimentary ways of making recommendations (e.g., simple word matching between events and likes), but I'd like to hear how experts in this field would approach the problem.
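
For reference, the simple word-matching idea could be sketched roughly as below. This is only an illustration under assumptions: the function names are made up, an event scores one point for each like whose words all appear among the event's keyword words, and the "Iron Maiden concert" event and its keywords are hypothetical (only the U2 keywords come from the example above).

```python
import re

def tokens(text):
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score_event(like_names, event_keywords):
    """Count the likes whose words all appear among the event's keyword words."""
    keyword_words = set()
    for keyword in event_keywords:
        keyword_words |= tokens(keyword)
    score = 0
    for name in like_names:
        words = tokens(name)
        if words and words <= keyword_words:
            score += 1
    return score

def rank_events(like_names, events):
    """Return event names with a positive score, best first (ties by name)."""
    scored = [(score_event(like_names, kws), name) for name, kws in events.items()]
    scored = [pair for pair in scored if pair[0] > 0]
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]

likes = ["Iron Maiden", "Black Sabbath", "Taxi Driver"]
events = {
    "U2 concert": ["U2", "U2 New Meadowlands Stadium",
                   "U2 East Rutherford", "U2 Concert"],
    # Hypothetical event, added only to show a match.
    "Iron Maiden concert": ["Iron Maiden", "Iron Maiden Concert"],
}
ranked = rank_events(likes, events)
print(ranked)
```

Exact word overlap is brittle (it misses synonyms, genres, and misspellings), so this is better treated as a baseline to compare other approaches against than as a finished recommender.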

I am looking for answers that: 1) recommend algorithms or procedures that can be used to solve this problem, or 2) suggest books/papers that discuss such problems.

Please tell me if the problem statement is too unclear.

Thanks a lot for the help.

asked Jun 27 '11 at 15:31

Sam K


2 Answers:

Start with baselines: recommend the most popular concerts and other events. As another starting point, recommend things that the user's friends like. Read some of the papers from the Netflix challenge; you can easily apply many of those algorithms using edge distance in the friend graph as a metric for user similarity.

Trying those things should help initially. The keyword you want is "collaborative filtering".
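
One way to read the "edge distance" suggestion is as a neighborhood-style collaborative filter in which users closer to you in the friend graph count for more. The sketch below assumes that reading: `friends` is an adjacency list, `attended` maps each user to the events they liked or attended, and each nearby user contributes a weight of 1/hops. All names and data are hypothetical, and the Netflix papers describe considerably more refined models than this.

```python
from collections import defaultdict, deque

def hop_distances(friends, source, max_hops=2):
    """Breadth-first search: hops from source to every user within max_hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        if dist[user] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in friends.get(user, ()):
            if friend not in dist:
                dist[friend] = dist[user] + 1
                queue.append(friend)
    return dist

def recommend(friends, attended, user, max_hops=2):
    """Rank unseen events by the summed 1/hops weight of nearby users who attended."""
    already_seen = attended.get(user, set())
    scores = defaultdict(float)
    for other, hops in hop_distances(friends, user, max_hops).items():
        if other == user:
            continue
        for event in attended.get(other, ()):
            if event not in already_seen:
                scores[event] += 1.0 / hops
    return sorted(scores, key=lambda e: (-scores[e], e))

# Hypothetical toy data: ann - bob - cat form a chain of friendships.
friends = {"ann": ["bob"], "bob": ["ann", "cat"], "cat": ["bob"]}
attended = {
    "ann": {"U2"},
    "bob": {"Iron Maiden", "U2"},
    "cat": {"Black Sabbath", "Iron Maiden"},
}
suggestions = recommend(friends, attended, "ann")
print(suggestions)
```

Setting every weight to 1 and ignoring the graph entirely reduces this to the "most popular events" baseline, which makes the two easy to compare on the same data.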

answered Jun 28 '11 at 00:48

zaxtax

I have several tricks in mind. If your problem is the poor semantics of each like, you could use external knowledge sources such as Wikipedia/DBpedia to get a more precise type/category for any particular like. You could also measure the Google distance between the items in the likes and the event keywords, and rank events according to the results.
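
Assuming "Google distance" here means the Normalized Google Distance (NGD) of Cilibrasi and Vitányi, it can be computed from search hit counts as NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(x) and f(y) are the hit counts for each term, f(x, y) is the count for pages containing both, and N is the number of indexed pages. The sketch below takes those counts as plain arguments; how to obtain them (which search API, which index size) is left open, and the numbers used are made up.

```python
import math

def normalized_google_distance(fx, fy, fxy, n):
    """NGD from hit counts: fx and fy per term, fxy for both terms, n = index size."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # terms that never co-occur are treated as unrelated
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))

# Made-up hit counts, for illustration only.
N = 10**9
close = normalized_google_distance(fx=2_000_000, fy=1_500_000, fxy=900_000, n=N)
far = normalized_google_distance(fx=2_000_000, fy=1_500_000, fxy=300, n=N)
print(close < far)
```

A smaller value means the two terms tend to appear together more often, so events could be ranked by the smallest distance between any of the user's likes and any of the event's keywords.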

answered Jun 27 '11 at 19:23

Sergey Bartunov

Seen: 1,980 times

Last updated: Jun 28 '11 at 00:48

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.