|
I am trying to get the MemeTracker data and mine it for the best media websites. Any suggestions on which algorithm to use on this kind of dataset, and why? I want to extract, for each website, the percentage of top quotes it mentions, and to determine which website is the best, something similar to http://memetracker.org/lag.html |
Can you explain which problem you want to solve exactly?
updated...
Do you mean the problem of downloading web pages with time stamps, or the problem of determining which pages are talking about the same news event?
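
If the goal is simply to rank sites by the share of top quotes they mention, a minimal sketch might look like the following. It assumes the dump has already been parsed into `(page_url, quote)` pairs; the function name `site_coverage`, the `top_n` parameter, and the definition of a "top" quote as one of the most frequently mentioned quotes are all assumptions for illustration, not anything MemeTracker itself prescribes.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def site_coverage(records, top_n=100):
    """Rank sites by the fraction of the top_n quotes they mention.

    records: iterable of (page_url, quote) pairs (hypothetical input format).
    Returns a list of (site, fraction) sorted from best to worst.
    """
    records = list(records)

    # A quote's popularity here is just the number of pages mentioning it.
    quote_counts = Counter(quote for _, quote in records)
    top_quotes = {quote for quote, _ in quote_counts.most_common(top_n)}
    if not top_quotes:
        return []

    # Collect, per site (host name), the distinct top quotes it mentions.
    seen = defaultdict(set)
    for url, quote in records:
        if quote in top_quotes:
            seen[urlparse(url).netloc].add(quote)

    coverage = {site: len(quotes) / len(top_quotes) for site, quotes in seen.items()}
    # Highest coverage first; ties broken alphabetically by site name.
    return sorted(coverage.items(), key=lambda item: (-item[1], item[0]))


records = [
    ("http://a.com/1", "q1"),
    ("http://a.com/2", "q2"),
    ("http://b.com/1", "q1"),
    ("http://c.org/1", "q2"),
    ("http://c.org/2", "q3"),
]
for site, fraction in site_coverage(records, top_n=2):
    print(site, fraction)
```

This only covers coverage, not how quickly a site picks up a quote; a lag measure would additionally need the page time stamps.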