I'm playing around with SIFT descriptors for images in an attempt to use them in a 'bag-of-words' style approach to image classification. I've cobbled together an assortment of random images, and after extracting features I have about 1.15 million feature descriptors. I've been mucking around with using [...] I'm looking for suggestions, whether that be a methodology, a paper recommendation, or a smack to the side of the head.
How about "Near duplicate image detection: min-Hash and tf-idf weighting"? MinHash lets you create signatures based on an LSH (locality-sensitive hashing) family. You can create multiple signatures and use those as a "bag-of-words" for the image to perform duplicate detection.
+1, as it's not a bad idea. That might be something for me to look at when I have more free time; in this case I was looking for something more along the lines of simple statistical methods or whatever.
(Oct 14 '11 at 19:17)
Brian Vandenberg
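For anyone wondering what the min-hash signature idea from the answer looks like in practice, here is a minimal sketch. It is not the paper's implementation. It assumes each image has already been reduced to a set of integer visual-word IDs (for example, by quantizing its SIFT descriptors against a vocabulary), and it uses a salted BLAKE2 hash as a stand-in for a family of random permutations. The names `minhash_signature` and `estimate_jaccard` and the toy image sets are made up for illustration.

```python
import hashlib


def hash_word(word_id, salt):
    """Hash a visual-word ID under the hash function identified by `salt`."""
    digest = hashlib.blake2b(f"{salt}:{word_id}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(word_ids, num_hashes=128):
    """Return a min-hash signature for a non-empty set of visual-word IDs.

    Position i holds the smallest hash value over the set under hash function i.
    """
    return [min(hash_word(w, i) for w in word_ids) for i in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical images, each represented by the set of visual words it contains.
image_a = set(range(0, 500))
image_b = set(range(50, 550))      # shares 450 words with image_a
image_c = set(range(5000, 5500))   # shares no words with image_a

sig_a = minhash_signature(image_a)
sig_b = minhash_signature(image_b)
sig_c = minhash_signature(image_c)

est_ab = estimate_jaccard(sig_a, sig_b)
est_ac = estimate_jaccard(sig_a, sig_c)

print("exact  J(a, b):", exact_jaccard(image_a, image_b))
print("approx J(a, b):", est_ab)
print("approx J(a, c):", est_ac)
```

The probability that two sets share the same minimum under a random hash equals their Jaccard similarity, so the fraction of matching positions approximates the overlap without comparing the full sets. More hash functions give a tighter estimate at the cost of longer signatures.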
I've heard good things about chapter 3 of "Mining of Massive Datasets": http://infolab.stanford.edu/~ullman/mmds.html
(Oct 15 '11 at 02:11)
Mathieu Blondel