Hey all,

I've had a nerdcrush on machine learning for a long time now, and I'm hoping to learn how it's actually implemented and what its limits are, and to have a little fun in the process.

I'm building a website that collects music, specifically tracks that have been 'favorited' by each user, from many of the open music platforms on the web. I'm in the process of building this into a radio station that plays you the music that's been favorited by your friends. You tune the station by picking a handful of friends you want to listen to, and then it plays a collection of music that they have in common.

None of this is terribly hard, but I'd like to make the predictions better.

I'd like to say, "if a user is listening to such and such group of friends, he's likely to want to hear just this specific subset of all the music I could play for him." How can machine learning help me do this? Would it be good at it? What kind of data would help it get better at making these predictions?

I'm also curious what kind of hardware / servers I'd need to handle doing these kinds of calculations. The data set gets really big. Like, really big. But maybe not that big, since I have no reference for how much data is reasonable to throw into an ML setup...

I hope I'm not being too noobish; I'm very excited to learn how this all comes together.

Thanks for your time, ~ Jordan

asked Oct 28 '10 at 00:33

Jordan Feldstein


2 Answers:

My research has to do with those questions, but for human activities.

Your basic problem is to build clusters that are personalized for each of your users.

You might want to take a look at clustering techniques or, as Noel mentioned, check the techniques used in the Netflix Prize contest.

answered Oct 29 '10 at 03:31

Leon Palafox ♦

The general problem you're solving is called collaborative filtering. The recent Netflix prize was a large-scale collaborative filtering problem. Reading about the Netflix prize (the top teams published their algorithms and setups) will give you an idea of what is considered large, what resources you need, and what algorithms would make a good starting point.
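To make that concrete, here is a minimal sketch of one common starting point, item-based collaborative filtering on "favorited" data. It is only an illustration under assumptions: the toy `favorites` dictionary, the function names, and the choice of cosine similarity over co-favorite counts are all made up for this example and are not something the Netflix teams or the original poster used.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(favorites):
    """Cosine similarity between tracks, based on how often they are co-favorited.

    favorites: dict mapping user id -> set of track ids (hypothetical layout).
    """
    counts = defaultdict(int)     # track -> number of users who favorited it
    co_counts = defaultdict(int)  # (track_a, track_b) -> users who favorited both
    for tracks in favorites.values():
        for track in tracks:
            counts[track] += 1
        for a, b in combinations(sorted(tracks), 2):
            co_counts[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n in co_counts.items():
        score = n / sqrt(counts[a] * counts[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(favorites, sims, listener, friends, top_n=5):
    """Rank the friends' favorites by similarity to what the listener already likes."""
    seen = favorites.get(listener, set())
    candidates = set().union(*(favorites.get(f, set()) for f in friends)) - seen
    scored = []
    for track in candidates:
        score = sum(sims[track].get(liked, 0.0) for liked in seen)
        scored.append((track, score))
    # Highest score first; ties broken by track id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [track for track, _ in scored[:top_n]]


# Toy data, invented purely for illustration.
favorites = {
    "jordan": {"a", "b"},
    "ann": {"a", "b", "c"},
    "bob": {"b", "c", "d"},
    "cat": {"d", "e"},
}
sims = item_similarities(favorites)
playlist = recommend(favorites, sims, "jordan", ["ann", "bob", "cat"])
print(playlist)
```

The pairwise counting is quadratic in the number of favorites per user, so on a really big data set you would likely want sparse matrices or a batch job rather than plain dictionaries.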

answered Oct 28 '10 at 02:00

Noel Welsh

Indeed. Tools that solve the collaborative filtering task are also called recommender systems (aka recsys). Here are some pragmatic pointers to get started:

How to build a recommender system?

Open source collaborative filtering and recommendation systems

Wikipedia entry on Slope One
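
As a rough idea of how little code Slope One needs, here is a minimal sketch of the weighted variant. It assumes you have numeric ratings per user (for binary favorites you would need some proxy such as play counts, which is an assumption on my part); the function names and the three-user toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_deviations(ratings):
    """Average rating difference between every pair of items rated by the same user.

    ratings: dict mapping user -> {item: rating} (hypothetical layout).
    Returns (dev, freq) where dev[(i, j)] is the mean of (rating_i - rating_j)
    and freq[(i, j)] is the number of users who rated both i and j.
    """
    diff_sum = defaultdict(float)
    freq = defaultdict(int)
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[(i, j)] += r_i - r_j
                    freq[(i, j)] += 1
    dev = {pair: diff_sum[pair] / freq[pair] for pair in diff_sum}
    return dev, freq


def predict(user_ratings, item, dev, freq):
    """Weighted Slope One prediction of `item` for one user, or None if unknown."""
    numerator = 0.0
    denominator = 0
    for j, r_j in user_ratings.items():
        if j == item:
            continue
        n = freq.get((item, j), 0)
        if n:
            numerator += (dev[(item, j)] + r_j) * n
            denominator += n
    return numerator / denominator if denominator else None


# Toy ratings, invented purely for illustration.
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
dev, freq = build_deviations(ratings)
lucy_a = predict(ratings["lucy"], "A", dev, freq)
print(round(lucy_a, 2))
```

Here item A is on average 0.5 above B (two users) and 3 above C (one user), so the prediction for lucy is the weighted mean of 2 + 0.5 and 5 + 3.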

(Oct 29 '10 at 04:33) ogrisel

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.