I have been doing NN with kdtrees, but now my tree is so big that kdtree queries take half a second. What is the best alternative to kdtrees? Right now I'm looking at LSH as an option

asked Jul 26 '12 at 13:39

mugetsu's gravatar image

mugetsu
233212431


3 Answers:

Approximate principal direction trees is a paper I saw at ICML 2012. It develops an alternatives to kd-trees that they claim scales better. As I recall the algorithm it was quite simple to implement, and very fast.

answered Jul 26 '12 at 14:29

Noel%20Welsh's gravatar image

Noel Welsh
72631023

If you data is small to medium dimensional (i.e. less than 50 dimension) you can have a look at the BallTree implementation in scikit-learn's neighbors package.

If you accept approximate answers then have a look at FLANN as Dougal said.

answered Jul 31 '12 at 09:15

ogrisel's gravatar image

ogrisel
498995591

Indeed, the core idea here is that BallTrees scale to higher dimensions than KDTree (see reference in scikit-learn documentation).

(Aug 06 '12 at 12:29) Gael Varoquaux

The FLANN (Fast Library for Approximate Nearest Neighbors) library provides good implementations of kd-tree forests, k-means trees, and LSH for approximate nearest-neighbor searches, along with a way to automatically select an algorithm and tune parameters.

answered Jul 29 '12 at 16:48

Dougal%20Sutherland's gravatar image

Dougal Sutherland
16113

I've been using vl_feat's kdtree, which is apparently an implementation of FLANN. I'm assuming both are the same, so I don't think using FLANN would make a big difference

(Jul 30 '12 at 13:52) mugetsu

KD-tree is an exact method. FLANN provides many alternative approximation that are much faster and state of the art. Usually there is a parameter to adjust the trade off between expected exactness and speed of the queries.

(Jul 31 '12 at 09:12) ogrisel
Your answer
toggle preview

powered by OSQA

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.