Information Organization: A case study in music recommendations

Sum­mary

I introduce “information organization”, an approach that I have been exploring for several years. Music recommendations serve as the case study: they ought to be well organized, but existing applications organize them poorly. I discuss the issues with current applications and the features that would address them.


Back­ground

Information organization is a family of patterns for collecting, presenting, and navigating information that is structured and/or textual. These patterns draw upon ideas from NLP, ML, IR, UX, and visualization, and in general they can be implemented as loosely coupled components. Combining the patterns leads to potent cross-interactions that add up to more than the sum of the individual patterns. This will become clearer when we dive into the case study. These patterns are non-trivial to implement and require some savvy in NLP or IR. For people knowledgeable in these arts, there is an opportunity to differentiate your offering and create unique value by implementing these patterns.

As a family of patterns, information organization is not tied to any particular application. Rather, it admits a variety of related applications that can build upon implemented patterns. In this case study, I’ll be talking about an application that organizes music recommendations.

I have been developing this concept of information organization for several years. Some of it has been implemented, but much of it has been designed and not yet implemented.


The prob­lem

People enjoy sharing music recommendations, but lack an effective platform for sharing them. Here I discuss sharing music preferences in numerical form (scores) and textual form (reviews). I ignore the question of sharing the audio itself.

Here are some cur­rent approaches:

  • Directly make a rec­om­men­da­tion to a friend, online or off. Prob­lem: Inher­ently tran­sient and non-archival form of shar­ing. Also, there is no way for me to get rec­om­men­da­tions from new people.
  • Lis­ten to the radio, online or off. Prob­lem: Not social. Inher­ently tran­sient and non-archival form of sharing.
  • I write a blog arti­cle for myself or a larger pub­lisher (e.g. Pitch­fork) with my review of the music. Prob­lem: Social fea­tures are lim­ited. Dis­cus­sion of par­tic­u­lar songs is frag­mented across pub­lish­ers, which means that rec­om­men­da­tions are not being effec­tively shared.
  • I share a YouTube link on Facebook, and discussion ensues. Problem: The discussion is circumscribed by my social circle, and I don’t have a mechanism for connecting with people outside my circle with whom I would nonetheless like to share music recommendations. Also, the historical archive is not accessible, and previous recommendations are not searchable.

So what we’re get­ting at is a music rec­om­men­da­tion sys­tem that has social, archival, and rec­om­men­da­tion features.


Approach

Here’s an appli­ca­tion that could solve these prob­lems. Click on the mockup image for a larger view of it.

Note that I am going to talk about many possible features for this application. If you are going to implement something, you should focus on core features. I talk about a variety of possible features to illustrate information organization patterns that are technically feasible but not yet commonplace.

At its core, the application supports two kinds of user activity:

  • Nav­i­gat­ing recommendations.
  • Adding rec­om­men­da­tions.

I’ll focus on nav­i­ga­tion, since nav­i­ga­tion sug­gests many of the most impor­tant features.

To illustrate navigation, consider viewing the recommendations for a particular song. This page will contain different recommendations for the song, summarized as expandable text snippets and presented in ranked order. For example, if you have reviewed this song, your recommendation will rank at the top. Recommendations from your friends rank higher, as do recommendations from people with taste similar to yours. (Some users care more about the taste of their friends, and others care more about the taste of people with similar tastes; the ranking should reflect each user’s preference.) Less important, but nonetheless useful, is the objective “authority” of the source. For example, when there is not enough social or personal information to rank the recommendations, those from well-respected sources like Pitchfork rank higher than those from unknown critics.
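The ranking described above could be sketched roughly as follows. This is only an illustration under my own assumptions: the field names, the weights, and the linear scoring formula are all hypothetical, and a real system would need to learn or tune them per user.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    # All fields are hypothetical, chosen only for this sketch.
    author: str
    snippet: str
    is_friend: bool = False        # author is in the viewer's social circle
    taste_similarity: float = 0.0  # 0..1, how close the author's taste is to the viewer's
    authority: float = 0.0         # 0..1, e.g. a well-known publication scores high


def rank_recommendations(recs, viewer, friend_weight=1.0,
                         taste_weight=1.0, authority_weight=0.1):
    """Return recs sorted best-first for the given viewer.

    The viewer's own review always comes first. The remaining reviews are
    ordered by a weighted sum; the small authority weight makes authority
    matter mostly when social and taste signals are absent.
    """
    def key(rec):
        own = rec.author == viewer
        score = (friend_weight * rec.is_friend
                 + taste_weight * rec.taste_similarity
                 + authority_weight * rec.authority)
        return (own, score)

    return sorted(recs, key=key, reverse=True)


recs = [
    Recommendation("pitchfork", "A well-known critic's take", authority=1.0),
    Recommendation("alice", "A friend's take", is_friend=True),
    Recommendation("me", "My own review"),
    Recommendation("bob", "A stranger with similar taste", taste_similarity=0.5),
    Recommendation("anon", "An unknown critic"),
]
ranked = [r.author for r in rank_recommendations(recs, viewer="me")]
```

A user who cares more about similar taste than about friends would simply get a larger `taste_weight`, which moves people like `bob` above friends like `alice`.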

Another aspect of nav­i­gat­ing is search. Search should imple­ment auto-complete and auto-suggest. Con­tent should be auto-tagged based upon exist­ing music meta-data as well as reviews, so that search­ing for “bounce” will find all bounce tracks, even if the term “bounce” is not explic­itly men­tioned in any review of a par­tic­u­lar track. Auto-tagging can be smoothed across dif­fer­ent tracks by the same artist, as well as dif­fer­ent tracks that have reviews that con­tain the same keywords.
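Here is a minimal sketch of the smoothing idea, assuming that tags have already been extracted from metadata and reviews as weighted scores. It covers only smoothing across tracks by the same artist; the function names, the `alpha` parameter, the threshold, and the track and artist identifiers are hypothetical.

```python
from collections import defaultdict


def smooth_tags(track_tags, track_artist, alpha=0.5):
    """Spread tag weights across tracks by the same artist.

    track_tags:   {track: {tag: weight}} from metadata and reviews
    track_artist: {track: artist}
    alpha:        how strongly a track inherits its siblings' tags
    """
    by_artist = defaultdict(list)
    for track, artist in track_artist.items():
        by_artist[artist].append(track)

    smoothed = {}
    for track, artist in track_artist.items():
        scores = defaultdict(float, track_tags.get(track, {}))
        siblings = [t for t in by_artist[artist] if t != track]
        for sibling in siblings:
            for tag, weight in track_tags.get(sibling, {}).items():
                # Add the average sibling weight, scaled down by alpha.
                scores[tag] += alpha * weight / len(siblings)
        smoothed[track] = dict(scores)
    return smoothed


def search(smoothed, tag, threshold=0.2):
    """Return tracks whose smoothed weight for tag meets the threshold, best first."""
    hits = [(scores[tag], track) for track, scores in smoothed.items()
            if scores.get(tag, 0.0) >= threshold]
    return [track for _, track in sorted(hits, reverse=True)]


track_tags = {"track_a": {"bounce": 1.0}, "track_c": {"ambient": 1.0}}
track_artist = {"track_a": "artist_x", "track_b": "artist_x", "track_c": "artist_y"}
smoothed = smooth_tags(track_tags, track_artist)
bounce_tracks = search(smoothed, "bounce")
```

In this toy data, `track_b` has no review that mentions “bounce”, but it is still found by the search because it inherits a reduced weight from `track_a` by the same artist. Smoothing across tracks whose reviews share keywords could follow the same shape, with review similarity taking the place of the same-artist grouping.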

Another aspect of nav­i­gat­ing is find­ing related enti­ties (entity = song, musi­cian, genre, tag, etc.). Besides see­ing pop­u­lar songs by the same musi­cian, it is also use­ful to see pop­u­lar songs in the same genre, related tags, etc. Auto-tagging helps again here to fig­ure out how related two enti­ties are.
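One simple way to estimate how related two entities are is to compare their tag sets. The sketch below uses Jaccard similarity, which is my choice for illustration rather than a prescribed method; the entity names and tags are made up.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag collections: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_entities(entity, entity_tags, top_n=3):
    """Return up to top_n other entities that share tags with entity, most similar first."""
    target = entity_tags[entity]
    scored = [(jaccard(target, tags), other)
              for other, tags in entity_tags.items() if other != entity]
    scored = [(score, other) for score, other in scored if score > 0]
    # Sort by descending similarity, breaking ties by name for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


entity_tags = {
    "song_a": {"bounce", "hip-hop", "new orleans"},
    "song_b": {"bounce", "hip-hop"},
    "song_c": {"hip-hop", "jazz"},
    "song_d": {"ambient"},
}
related = related_entities("song_a", entity_tags)
```

Because the entities are just keys with tag sets, the same function works whether the keys are songs, musicians, or genres, as long as auto-tagging has given each of them tags.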

There is also the issue that there is no portable open data for­mat for record­ing numer­i­cal pref­er­ences about some entity (AFAIK). Sim­ply for­mal­iz­ing the exchange of pref­er­ence infor­ma­tion (not just for music, but any type of entity) would be a big deal.
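To make the idea concrete, here is what a portable preference record might look like when serialized as JSON. This is a hypothetical format invented for this sketch, not an existing standard; every field name is an assumption.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PreferenceRecord:
    # Hypothetical fields; no existing standard is implied.
    user: str
    entity_type: str              # e.g. "song", "musician", "genre"
    entity_id: str                # some identifier for the entity
    score: Optional[float] = None
    scale_min: float = 0.0        # the scale is stored so scores stay comparable
    scale_max: float = 1.0
    review: Optional[str] = None

    def normalized_score(self):
        """Map the score onto 0..1, or return None if no score was given."""
        if self.score is None:
            return None
        return (self.score - self.scale_min) / (self.scale_max - self.scale_min)


def to_json(record):
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text):
    return PreferenceRecord(**json.loads(text))


record = PreferenceRecord(user="alice", entity_type="song",
                          entity_id="example-song-id", score=8.0,
                          scale_min=0.0, scale_max=10.0,
                          review="Great bass line.")
round_tripped = from_json(to_json(record))
```

Storing the scale alongside the score is the main design choice here: it lets an 8 out of 10 from one site be compared with a 4 out of 5 from another.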


Recap

We have discussed a handful of different components (ranking based upon the social graph, navigating based upon related music, etc.). These features are non-trivial to implement and require some NLP or IR savvy. The challenge in implementing these features poses an opportunity to those who can. The more features implemented, the more value is created by their interactions, so the application can phase shift to a higher echelon of quality. But there is clearly value in a music recommendation system that has only a partial feature set. Which features are the easiest to implement that create the most value upfront? I believe that social features, and integration with Facebook and/or Twitter, can add a lot of value in terms of creating engagement.

What is the min­i­mum viable prod­uct for this task?
Pos­si­ble answers:

  • You log in with Twitter, and type a review of a song or a band. The review is auto-tweeted, and also added to the recommendation page for this song or band.
  • Scrape web reviews and rec­om­men­da­tions. Extract a sum­mary text snip­pet for each. (Note that this is a non-trivial fea­ture.) Aggre­gate these snip­pets on the site.

A ben­e­fit of the first approach is that it is inher­ently social. A ben­e­fit of the sec­ond approach is that it com­bats the “cold start” prob­lem, i.e. it imme­di­ately pop­u­lates the data­base with use­ful information.

I am curi­ous what you think is a good min­i­mum viable prod­uct. Is per­son­al­iza­tion of the view part of the core fea­ture set?


About the author

I consult on data strategy (NLP, ML, business intelligence, etc.).
If you are interested in building out any of these ideas, get in touch with me and I can help. In particular, I can advise on how to build the ML + NLP components. I’ll help you get a practical prototype up and running quickly, and show you how to refine and improve the components as necessary. Part of the art in implementing information organization is identifying the components that add the most immediate value, and quickly implementing a solid baseline.

  • http://twitter.com/rohi81 Rohit Nal­lapeta

    Your post has some excellent points. I am interested in the area of recommendations, especially subjective and objective recommendations. I consider objective recommendations to be independent of the personal aspect: they rate a product against a list of benchmarks such as economic value, cost, demand and general value. Then there is the aspect of personal recommendations or subjective choice, where things take a different dimension or gain weights based on social aspects of the person himself… would like to discuss more and pitch.

  • http://twitter.com/ogrisel Olivier Grisel

    Inter­est­ing analy­sis. I think your approach is match­ing the “entity hub” con­cept that is emerg­ing in the linked data com­mu­nity: http://bnode.org/blog/2010/07/30/dynamic-semantic-publishing-for-any-blog-part-1

    Only you add the auto­mated extrac­tion layer using NLP named entity detec­tion + IR sim­i­lar­ity queries.

  • http://metaoptimize.com Joseph Turian

    Sure, send me an email. I’d love to hear what you’re work­ing on.

  • http://twitter.com/sebpaquet Seb Paquet

    Inter­est­ing idea. It would be use­ful if you con­trasted your approach with exist­ing, state-of-the-art social music rec­om­men­da­tion sites, like Last.fm and Grooveshark.
