|
I'm implementing object recognition with bag-of-words histograms, where each image is represented by a histogram over my visual vocabulary. The histograms are built from 200 "words" per image, obtained by running k-means on that image's descriptors. The problem is that for a large dataset, say 5000 images, this gives 200 x 5000 = 1,000,000 possible words in the vocabulary, so every object ends up represented by a 1,000,000-length histogram. That gets too big and cumbersome past a certain point. Is there some way around this?
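Roughly what I'm doing to build each histogram, sketched with numpy (the sizes and names here are just placeholders for my actual pipeline):

    import numpy as np

    def bow_histogram(descriptors, vocab):
        """descriptors: (n, d) local features for one image.
        vocab: (k, d) visual words (k-means centers)."""
        # assign each descriptor to its nearest visual word
        d2 = ((descriptors[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=2)
        words = d2.argmin(axis=1)
        # count occurrences of each word -> fixed-length histogram
        hist = np.bincount(words, minlength=len(vocab)).astype(float)
        return hist / max(hist.sum(), 1.0)  # L1-normalize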
|
As far as I know, in NLP you have to manage very large datasets all the time. Remember that your training data will have very few significant values per example, and the values for the others are zero or near zero. This means you'll be handling sparse matrices. Some languages have very good implementations for them; Matlab in particular has a sparse matrix data structure. You can also create your own, e.g. a Python-style dictionary where you only store the values for which you actually have an entry.
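For example, a quick sketch in Python of the dictionary idea (plain Counters here; scipy.sparse gives you the same thing with proper matrix operations):

    from collections import Counter
    import scipy.sparse as sp

    # store only the words that actually occur in an image;
    # anything not in the dict is implicitly zero
    def sparse_histogram(word_ids):
        return Counter(word_ids)

    # stack many per-image histograms into one CSR matrix whose memory cost
    # is proportional to the number of nonzeros, not vocab_size * num_images
    def stack_histograms(histograms, vocab_size):
        rows, cols, vals = [], [], []
        for i, hist in enumerate(histograms):
            for word, count in hist.items():
                rows.append(i)
                cols.append(word)
                vals.append(count)
        return sp.csr_matrix((vals, (rows, cols)),
                             shape=(len(histograms), vocab_size))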
thanks! Didn't realize there was this option – mugetsu (Jul 25 '12 at 12:20)
|
Why do you build a separate vocabulary per image? You could simply run k-means once on patches extracted from all 5000 images (say you get 4096 means from that), and then your full bag-of-visual-words dataset is a 4096x5000 matrix.
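A rough sketch of that with scikit-learn's MiniBatchKMeans (just one convenient implementation; its partial_fit method also lets you update the shared codebook incrementally if the data arrives in batches):

    import numpy as np
    from sklearn.cluster import MiniBatchKMeans

    def build_codebook(all_descriptors, k=4096):
        """all_descriptors: (N, d) patches/descriptors pooled from every image."""
        km = MiniBatchKMeans(n_clusters=k, batch_size=10000)
        km.fit(all_descriptors)
        return km  # one shared vocabulary for the whole dataset

    def encode(image_descriptors, km):
        # quantize against the shared codebook -> fixed k-length histogram
        words = km.predict(image_descriptors)
        return np.bincount(words, minlength=km.n_clusters)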
@Laurens van der Maaten this is for online learning