Unofficial Python wrapper for the nanoflann k-d tree
Unofficial Python wrapper for the nanoflann library [1] with a scikit-learn-compatible interface and additional multithreaded capabilities.
nanoflann's k-d tree implementation offers some of the best performance on many k-nearest-neighbour problems [2].
It is a good choice for exact k-NN and radius searches in low-dimensional spaces.
pip install git+https://github.com/u1234x1234/[email protected]
import numpy as np
import pynanoflann
index = np.random.uniform(0, 100, (100, 4))
queries = np.random.uniform(0, 100, (10, 4))
nn = pynanoflann.KDTree(n_neighbors=5, metric='L1', radius=100)
nn.fit(index)
# Get k-nearest neighbors
distances, indices = nn.kneighbors(queries)
# Radius search
distances, indices = nn.radius_neighbors(queries)
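For reference, the shape and ordering of the k-NN results can be checked against a pure-NumPy brute-force search. This is an illustrative sketch only; the `brute_force_knn` helper below is hypothetical and not part of pynanoflann. It uses the same L1 metric as the example above:

```python
import numpy as np

def brute_force_knn(index_pts, query_pts, k):
    # Pairwise L1 distances between every query and every indexed point.
    dists = np.abs(query_pts[:, None, :] - index_pts[None, :, :]).sum(axis=-1)
    # Indices of the k smallest distances per query, sorted ascending.
    idx = np.argsort(dists, axis=1)[:, :k]
    return np.take_along_axis(dists, idx, axis=1), idx

index_pts = np.random.uniform(0, 100, (100, 4))
query_pts = np.random.uniform(0, 100, (10, 4))
distances, indices = brute_force_knn(index_pts, query_pts, k=5)
# One row per query: distances.shape == (10, 5), indices.shape == (10, 5)
```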
If you need to save the model, there are two options:
The first option is to save only the built index. The original data is not stored, so the same data must be supplied again at load time:

kdtree = pynanoflann.KDTree()
kdtree.fit(X)
kdtree.save_index('index.bin')

prebuilt_kdtree = pynanoflann.KDTree()
# Must use the same data on which the index was built.
prebuilt_kdtree.fit(X, 'index.bin')
Please refer to the detailed example
The second option is to pickle the whole model, data included:

import pickle

kdtree.fit(X)
with open('kdtree.pkl', 'wb') as out_file:
    pickle.dump(kdtree, out_file)
with open('kdtree.pkl', 'rb') as in_file:
    unpickled_kdtree = pickle.load(in_file)
Please refer to the detailed example
Multicore parallelization is implemented on the C++ side to avoid Python's multiprocessing overhead.
Query parallelization: Example
Simultaneous indexing+querying parallelization: Example, Discussion
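The idea behind query parallelization can be sketched at the Python level: split the query set into chunks and search each chunk in a separate worker. pynanoflann performs this on the C++ side, so the stand-in below, which uses a hypothetical brute-force search in place of a tree query, is purely illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def knn_chunk(index_pts, query_chunk, k):
    # Stand-in for a k-d tree query: brute-force L1 search over one chunk.
    dists = np.abs(query_chunk[:, None, :] - index_pts[None, :, :]).sum(axis=-1)
    idx = np.argsort(dists, axis=1)[:, :k]
    return np.take_along_axis(dists, idx, axis=1), idx

index_pts = np.random.uniform(0, 100, (1000, 4))
queries = np.random.uniform(0, 100, (200, 4))

chunks = np.array_split(queries, 4)  # one chunk per worker
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda c: knn_chunk(index_pts, c, 5), chunks))

# Reassemble per-chunk results in the original query order.
distances = np.concatenate([d for d, _ in results])
indices = np.concatenate([i for _, i in results])
```

Because the heavy work here happens inside NumPy (which releases the GIL), threads suffice for the sketch; the real gain in pynanoflann comes from parallelizing inside the C++ query loop itself.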
Generally it is much faster than the brute-force or Cython k-d tree implementations in scikit-learn.
To run the benchmark:
python benchmark.py
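A minimal timing harness along these lines can be sketched as follows. The `time_queries` helper and the brute-force function are hypothetical stand-ins for whichever implementation is being measured:

```python
import time

import numpy as np

def time_queries(query_fn, queries, repeats=3):
    # Return the best wall-clock time over several runs to reduce noise.
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        query_fn(queries)
        best = min(best, time.perf_counter() - start)
    return best

index_pts = np.random.uniform(0, 100, (2000, 4))
queries = np.random.uniform(0, 100, (500, 4))

def brute_force(qs):
    # Baseline to compare a k-d tree implementation against.
    dists = np.abs(qs[:, None, :] - index_pts[None, :, :]).sum(axis=-1)
    return np.argsort(dists, axis=1)[:, :5]

elapsed = time_queries(brute_force, queries)
print(f'brute force: {elapsed:.4f}s')
```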