In this doc, we benchmark the performance of multiple K-Nearest Neighbor (KNN) algorithms implemented by :func:`dgl.knn_graph`.
Given a dataset of ``N`` samples with ``D`` dimensions, the common use case of KNN algorithms in graph learning is to build a KNN graph by finding the ``K`` nearest neighbors for each of the ``N`` samples within the dataset.
Empirically, all three parameters, ``N``, ``D``, and ``K``, affect the computation cost. To benchmark the algorithms, we pick a few representative datasets to cover the most common scenarios:
* A synthetic dataset with mixed Gaussian samples: ``N = 1000``, ``D = 3``.
* A point cloud sample from ModelNet: ``N = 10000``, ``D = 3``.
* Subsets of MNIST:
- A small subset: ``N = 1000``, ``D = 784``
- A medium subset: ``N = 10000``, ``D = 784``
- A large subset: ``N = 50000``, ``D = 784``
Some notes (a short usage sketch follows this list):
* ``bruteforce-sharemem`` is an optimized implementation of ``bruteforce`` on GPU.
* ``kd-tree`` is currently only implemented on CPU.
* ``bruteforce-blas`` relies on matrix multiplication and is therefore memory-inefficient.
* ``nn-descent`` is an approximate algorithm, so we also report the recall rate of its results.
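To make the comparison concrete, below is a minimal usage sketch. It assumes the algorithm names above are passed through the ``algorithm`` keyword of :func:`dgl.knn_graph` and that the CPU-capable variants accept a CPU tensor; it is illustrative rather than part of the benchmark script.

.. code:: python

    import time

    import torch
    import dgl

    # Random points standing in for the smallest setting above (N = 1000, D = 3).
    x = torch.randn(1000, 3)
    k = 8

    # 'kd-tree' is CPU-only and 'bruteforce-sharemem' is GPU-only (see the notes
    # above), so only the CPU-capable variants are timed on a CPU tensor here.
    for algo in ['bruteforce-blas', 'bruteforce', 'kd-tree']:
        tic = time.time()
        g = dgl.knn_graph(x, k, algorithm=algo)
        print(f'{algo:>16s}: {time.time() - tic:.4f} s, {g.num_edges()} edges')

    # On a GPU machine, the GPU variants can be timed the same way.
    if torch.cuda.is_available():
        g = dgl.knn_graph(x.to('cuda'), k, algorithm='bruteforce-sharemem')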
Results
-------
In this section, we show the runtime and recall rate (where applicable) for the algorithms under various scenarios.
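Where the recall rate is reported, it is the fraction of exact nearest-neighbor edges that the approximate graph recovers. The helper below is a hypothetical sketch of that computation (``knn_recall`` is not a DGL API, and it assumes ``nn-descent`` can run on the same device as the brute-force reference):

.. code:: python

    import torch
    import dgl

    def knn_recall(g_approx, g_exact):
        """Fraction of exact KNN edges that also appear in the approximate graph."""
        exact = set(zip(*(t.tolist() for t in g_exact.edges())))
        approx = set(zip(*(t.tolist() for t in g_approx.edges())))
        return len(exact & approx) / len(exact)

    x = torch.randn(1000, 784)  # roughly the small MNIST subset above
    g_exact = dgl.knn_graph(x, 8, algorithm='bruteforce-blas')
    g_approx = dgl.knn_graph(x, 8, algorithm='nn-descent')
    print(f'nn-descent recall: {knn_recall(g_approx, g_exact):.3f}')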
The experiments are run on an Amazon EC2 p3.2xlarge instance, which has 8 vCPUs, 61 GB of RAM, and one Tesla V100 GPU with 16 GB of memory. In terms of the environment, we obtain the numbers with DGL==0.7.0 (`64d0f3f <https://github.com/dmlc/dgl/commit/64d0f3f3554911ec06d015f1c9659180796adf9a>`_), PyTorch==1.8.1, and CUDA==11.1 on Ubuntu 18.04.5 LTS.