I have been doing nearest-neighbor search with kd-trees, but now my tree is so big that kd-tree queries take half a second. What is the best alternative to kd-trees? Right now I'm looking at LSH as an option.
"Approximate Principal Direction Trees" is a paper I saw at ICML 2012. It develops an alternative to kd-trees that the authors claim scales better. As I recall, the algorithm was quite simple to implement and very fast.
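As a rough illustration of the flavor of such trees (not the paper's exact algorithm; every function name and parameter below is my own placeholder), each node can split its points along an approximate principal direction, estimated with a few cheap power-iteration steps, instead of along a coordinate axis as a kd-tree does:

    import numpy as np

    def approx_principal_direction(X, n_iter=5, rng=None):
        """Estimate the top principal direction of X with a few power-iteration
        steps on the (implicit) covariance matrix -- cheaper than a full SVD."""
        rng = np.random.default_rng(rng)
        Xc = X - X.mean(axis=0)
        v = rng.normal(size=X.shape[1])
        v /= np.linalg.norm(v)
        for _ in range(n_iter):
            v = Xc.T @ (Xc @ v)        # one covariance-times-vector product
            v /= np.linalg.norm(v)
        return v

    def build_pd_tree(X, idx=None, leaf_size=32, rng=None):
        """Recursively split points at the median of their projection onto the
        approximate principal direction; leaves store point indices."""
        if idx is None:
            idx = np.arange(len(X))
        if len(idx) <= leaf_size:
            return {"leaf": True, "idx": idx}
        v = approx_principal_direction(X[idx], rng=rng)
        proj = X[idx] @ v
        thresh = np.median(proj)
        left, right = idx[proj <= thresh], idx[proj > thresh]
        if len(left) == 0 or len(right) == 0:   # degenerate split, stop here
            return {"leaf": True, "idx": idx}
        return {"leaf": False, "v": v, "thresh": thresh,
                "left": build_pd_tree(X, left, leaf_size, rng),
                "right": build_pd_tree(X, right, leaf_size, rng)}

Splitting along a direction of high variance is what lets this kind of tree adapt to low-dimensional structure in the data, which is roughly the scaling argument being alluded to here.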
If your data is small to medium dimensional (i.e. fewer than 50 dimensions), have a look at the BallTree implementation in scikit-learn's neighbors package. If you accept approximate answers, then have a look at FLANN, as Dougal said. Indeed, the core idea here is that BallTrees scale to higher dimensions than KDTree (see the references in the scikit-learn documentation).
(Aug 06 '12 at 12:29)
Gael Varoquaux
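For concreteness, a minimal usage sketch of the BallTree mentioned above (scikit-learn's sklearn.neighbors.BallTree); the data size, leaf_size, and k below are arbitrary placeholders:

    import numpy as np
    from sklearn.neighbors import BallTree

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100_000, 20))    # placeholder: 100k points in 20 dims

    tree = BallTree(X, leaf_size=40)          # build once, query many times
    queries = rng.standard_normal((5, 20))
    dist, ind = tree.query(queries, k=10)     # exact 10-NN for each query point
    print(ind.shape, dist.shape)              # (5, 10) (5, 10)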
The FLANN (Fast Library for Approximate Nearest Neighbors) library provides good implementations of kd-tree forests, k-means trees, and LSH for approximate nearest-neighbor searches, along with a way to automatically select an algorithm and tune its parameters.

I've been using vl_feat's kdtree, which is apparently an implementation of FLANN. I'm assuming both are the same, so I don't think using FLANN would make a big difference.
(Jul 30 '12 at 13:52)
mugetsu
KD-tree is an exact method. FLANN provides many alternative approximate methods that are much faster and state of the art. Usually there is a parameter to adjust the trade-off between expected exactness and query speed.
(Jul 31 '12 at 09:12)
ogrisel
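As an illustration of that exactness/speed knob, here is a sketch using FLANN's Python bindings (pyflann), assuming the build_index/nn_index interface with the autotuned mode and its target_precision parameter; treat the exact argument names as approximate:

    import numpy as np
    from pyflann import FLANN   # Python bindings shipped with FLANN

    rng = np.random.default_rng(0)
    dataset = rng.random((100_000, 128)).astype(np.float32)
    queries = rng.random((1_000, 128)).astype(np.float32)

    flann = FLANN()
    # "autotuned" tries kd-tree forests, k-means trees, and linear scan, then
    # picks parameters for the requested precision; target_precision is the
    # speed-vs-exactness knob (0.9 = roughly 90% of true neighbors returned).
    params = flann.build_index(dataset, algorithm="autotuned",
                               target_precision=0.9, log_level="info")
    neighbors, dists = flann.nn_index(queries, num_neighbors=10,
                                      checks=params["checks"])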