Commit 2d31cba9 authored by Ruilong Li(李瑞龙), committed by GitHub

Complete docs for 0.5.0 (#184)

* missing docs for the 4 dropin examples

* fix docs

* profiling in coding.rst
parent ba78cbdc
BARF
====================
In this example we showcase how to plug the nerfacc library into the *official* codebase
of `BARF <https://chenhsuanlin.bitbucket.io/bundle-adjusting-NeRF/>`_. See
`our forked repo <https://github.com/liruilong940607/barf/tree/90440d975fc76b3559126992b2fbce27dd02456f>`_
for details.
Benchmark: NeRF-Synthetic Dataset
---------------------------------
*updated on 2023-04-04 with nerfacc==0.5.0*
Our experiments are conducted on a single NVIDIA GeForce RTX 2080 Ti.

+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| PSNR                  | Lego  | Mic   |Materials| Chair |Hotdog | Ficus | Drums | Ship  | MEAN  |
|                       |       |       |         |       |       |       |       |       |       |
+=======================+=======+=======+=========+=======+=======+=======+=======+=======+=======+
| BARF                  | 31.16 | 23.87 | 26.28   | 34.48 | 28.4  | 27.86 | 31.07 | 27.55 | 28.83 |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| training time         | 9.5hrs| 9.5hrs| 9.2hrs  | 9.3hrs|12.3hrs| 9.3hrs| 9.3hrs| 9.5hrs| 9.8hrs|
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| camera errors (R)     | 0.105 | 0.047 | 0.085   | 0.226 | 0.071 | 0.846 | 0.068 | 0.089 | 0.192 |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| camera errors (T)     | 0.0043| 0.0021| 0.0040  | 0.0120| 0.0026| 0.0272| 0.0025| 0.0044| 0.0074|
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| Ours (occ)            | 32.25 | 24.77 | 27.73   | 35.84 | 29.98 | 28.83 | 32.84 | 28.62 | 30.11 |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| training time         | 1.5hrs| 2.0hrs| 2.0hrs  | 2.3hrs| 2.2hrs| 1.9hrs| 2.2hrs| 2.3hrs| 2.0hrs|
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| camera errors (R)     | 0.081 | 0.036 | 0.056   | 0.171 | 0.058 | 0.039 | 0.039 | 0.079 | 0.070 |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| camera errors (T)     | 0.0038| 0.0019| 0.0031  | 0.0106| 0.0021| 0.0013| 0.0014| 0.0041| 0.0035|
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+

Dynamic NeRFs
===================================
The :class:`nerfacc.PropNetEstimator` can naturally work with dynamic NeRFs. To make the
:class:`nerfacc.OccGridEstimator` also work with dynamic NeRFs, we need to make some compromises.
In these examples, we use the :class:`nerfacc.OccGridEstimator` to estimate the
`maximum` opacity at each area `over all the timestamps`. This allows us to share the same estimator
across all the timestamps, including those timestamps that are not in the training set.
In other words, we use it to cache the union of the occupancy at all timestamps.
It is not optimal, but it still makes rendering very efficient as long as the motion is not too large.
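
The "union of the occupancy at all timestamps" idea can be illustrated with a small, self-contained sketch. It does not use the nerfacc API: the 1D grid, the moving blob, and the names ``max_opacity_over_time`` and ``toy_opacity`` are all hypothetical and exist only to show why one shared grid covers every timestamp, and why it is looser than a per-timestamp grid.

```python
from typing import Callable, List, Sequence


def max_opacity_over_time(
    cell_centers: Sequence[float],
    timestamps: Sequence[float],
    opacity_fn: Callable[[float, float], float],
) -> List[float]:
    """For each grid cell, keep the maximum opacity seen at any timestamp."""
    return [max(opacity_fn(x, t) for t in timestamps) for x in cell_centers]


def toy_opacity(x: float, t: float) -> float:
    """Hypothetical scene: an opaque blob of half-width 0.1 moving from x=0.2 to x=0.6."""
    center = 0.2 + 0.4 * t
    return 1.0 if abs(x - center) <= 0.1 else 0.0


# Ten cells covering [0, 1]; timestamps sampled from the (toy) training set.
cell_centers = [(i + 0.5) / 10 for i in range(10)]
timestamps = [0.0, 0.25, 0.5, 0.75, 1.0]

# One grid shared by all timestamps: a cell is occupied if it is
# occupied at *any* timestamp (threshold chosen arbitrarily here).
grid = max_opacity_over_time(cell_centers, timestamps, toy_opacity)
occupied = [opacity > 0.01 for opacity in grid]

print(occupied)
```

In this toy scene the blob covers two cells at any single timestamp, while the shared grid marks six cells as occupied. That gap is the cost of the compromise, and it grows with the amount of motion.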
Performance Overview
--------------------
*updated on 2023-04-04*
+----------------------+-----------+----------------------------------+-----------------------+--------------------------+
| Methods | Dataset | Training Time :math:`\downarrow` | PSNR :math:`\uparrow` | LPIPS :math:`\downarrow` |
......
K-Planes
====================
In this example we showcase how to plug the nerfacc library into the *official* codebase
of `K-Planes <https://sarafridov.github.io/K-Planes/>`_. See
`our forked repo <https://github.com/liruilong940607/kplanes/tree/b97bc2eefc18f00cd54833800e7fc1072e58be51>`_
for details.
Benchmark: D-NeRF Dataset
---------------------------------
*updated on 2023-04-04 with nerfacc==0.5.0*
Our experiments are conducted on a single NVIDIA GeForce RTX 2080 Ti.

+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| PSNR                 | bouncing | hell    | hook  | jumping | lego  | mutant | standup | trex  | MEAN  |
|                      | balls    | warrior |       | jacks   |       |        |         |       |       |
+======================+==========+=========+=======+=========+=======+========+=========+=======+=======+
| K-Planes             | 39.10    | 23.95   | 27.76 | 31.11   | 25.18 | 32.44  | 32.51   | 30.25 | 30.29 |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| training time        | 68min    | 70min   | 70min | 70min   | 70min | 71min  | 72min   | 71min | 70min |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| Ours (occ)           | 38.95    | 24.00   | 27.74 | 30.46   | 25.25 | 32.58  | 32.84   | 30.49 | 30.29 |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| training time        | 41min    | 41min   | 40min | 40min   | 39min | 39min  | 40min   | 38min | 40min |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+

.. _`TiNeuVox Example`:
TiNeuVox
====================
In this example we showcase how to plug the nerfacc library into the *official* codebase
of `TiNeuVox <https://jaminfong.cn/tineuvox/>`_. See
`our forked repo <https://github.com/liruilong940607/tineuvox/tree/0999858745577ff32e5226c51c5c78b8315546c8>`_
for details.
Benchmark: D-NeRF Dataset
---------------------------------
*updated on 2023-04-04 with nerfacc==0.5.0*
Our experiments are conducted on a single NVIDIA GeForce RTX 2080 Ti.

+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| PSNR                 | bouncing | hell    | hook  | jumping | lego  | mutant | standup | trex  | MEAN  |
|                      | balls    | warrior |       | jacks   |       |        |         |       |       |
+======================+==========+=========+=======+=========+=======+========+=========+=======+=======+
| TiNeuVox             | 39.37    | 27.05   | 29.61 | 32.92   | 24.32 | 31.47  | 33.59   | 30.01 | 31.04 |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| Training Time        | 832s     | 829s    | 833s  | 841s    | 824s  | 833s   | 827s    | 840s  | 833s  |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| Ours (occ)           | 40.56    | 27.17   | 31.35 | 33.44   | 25.17 | 34.05  | 35.35   | 32.29 | 32.42 |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| Training Time        | 378s     | 302s    | 342s  | 325s    | 355s  | 360s   | 346s    | 362s  | 346s  |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+

Benchmark: HyperNeRF Dataset
---------------------------------
*updated on 2023-04-04 with nerfacc==0.5.0*
Our experiments are conducted on a single NVIDIA GeForce RTX 2080 Ti.

+----------------------+----------+---------+-------+-------------+-------+
| PSNR                 | 3dprinter| broom   |chicken| peel-banana | MEAN  |
|                      |          |         |       |             |       |
+======================+==========+=========+=======+=============+=======+
| TiNeuVox             | 22.77    | 21.30   | 28.29 | 24.50       | 24.22 |
+----------------------+----------+---------+-------+-------------+-------+
| Training Time        | 3253s    | 2811s   | 3933s | 2705s       | 3175s |
+----------------------+----------+---------+-------+-------------+-------+
| Ours (occ)           | 22.72    | 21.27   | 28.27 | 24.54       | 24.20 |
+----------------------+----------+---------+-------+-------------+-------+
| Training Time        | 2265s    | 2221s   | 2157s | 2101s       | 2186s |
+----------------------+----------+---------+-------+-------------+-------+
| Ours (prop)          | 22.75    | 21.17   | 28.27 | 24.97       | 24.29 |
+----------------------+----------+---------+-------+-------------+-------+
| Training Time        | 2307s    | 2281s   | 2267s | 2510s       | 2341s |
+----------------------+----------+---------+-------+-------------+-------+

......@@ -13,13 +13,6 @@ for the radiance field and a 4-layer-MLP for the warping field. The only major d
we reduce the max frequency of the positional encoding from 10 to 4, to respect the fact that the
motion of the object is relatively smooth.
.. note::
The :class:`nerfacc.OccGridEstimator` used in this example is shared by all the frames. In other words,
instead of using it to indicate the opacity of an area at a single timestamp,
here we use it to indicate the `maximum` opacity at this area `over all the timestamps`.
It is not optimal but still makes the rendering very efficient.
Benchmarks: D-NeRF Dataset
---------------------------
......@@ -32,7 +25,7 @@ The training memory footprint is about 11GB.
| PSNR | bouncing | hell | hook | jumping | lego | mutant | standup | trex | MEAN |
| | balls | warrior | | jacks | | | | | |
+======================+==========+=========+=======+=========+=======+========+=========+=======+=======+
| D-Nerf (~ days) | 32.80 | 25.02 | 29.25 | 32.80 | 21.64 | 31.29 | 32.79 | 31.75 | 29.67 |
| D-NeRF (~ days) | 32.80 | 25.02 | 29.25 | 32.80 | 21.64 | 31.29 | 32.79 | 31.75 | 29.67 |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
| Ours (~ 1 hr) | 39.49 | 25.58 | 31.86 | 32.73 | 24.32 | 35.55 | 35.90 | 32.33 | 32.22 |
+----------------------+----------+---------+-------+---------+-------+--------+---------+-------+-------+
......
......@@ -7,29 +7,29 @@ Performance Overview
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| Methods | Dataset | Training Time :math:`\downarrow` | PSNR :math:`\uparrow` | LPIPS :math:`\downarrow` |
+======================+================+==================================+=======================+==========================+
| TensoRF `[1]`_ | Tanks&Temple | 18.3min | 28.13 | 0.143 |
| TensoRF `[1]`_ | Tanks&Temple | 19min | 28.11 | 0.167 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (occgrid)* | | 12.6min | 28.10 | 0.150 |
| *+nerfacc (occgrid)* | | 14min | 28.06 | 0.174 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| TensoRF `[1]`_ | NeRF-Synthetic | 10.6min | 32.52 | 0.047 |
| TensoRF `[1]`_ | NeRF-Synthetic | 10.3min | 32.73 | 0.049 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (occgrid)* | | 6.5min | 32.51 | 0.044 |
| *+nerfacc (occgrid)* | | 7.1min | 32.52 | 0.054 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| NeRF `[2]`_ | NeRF-Synthetic | 20hours | 31.00 | 0.047 |
| NeRF `[2]`_ | NeRF-Synthetic | days | 31.00 | 0.047 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (occgrid)* | | 52min | 31.55 | 0.072 |
| *+nerfacc (occgrid)* | | 1hr | 31.55 | 0.072 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| Instant-NGP `[3]`_ | NeRF-Synthetic | 4.4min | 32.35 | |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (occgrid)* | | 4.4min | 32.55 | 0.056 |
| *+nerfacc (occgrid)* | | 4.5min | 33.11 | 0.053 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (propnet)* | | 5.2min | 31.40 | 0.064 |
| *+nerfacc (propnet)* | | 4.0min | 31.76 | 0.062 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| Instant-NGP `[3]`_ | Mip-NeRF 360 | 5.3min | 25.93 | |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (occgrid)* | | 5.5min | 26.38 | 0.351 |
| *+nerfacc (occgrid)* | | 5.0min | 26.41 | 0.353 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
| *+nerfacc (propnet)* | | 5.0min | 27.21 | 0.300 |
| *+nerfacc (propnet)* | | 4.9min | 27.58 | 0.292 |
+----------------------+----------------+----------------------------------+-----------------------+--------------------------+
Implementation Details
......
.. _`TensoRF Example`:
TensoRF
====================
In this example we showcase how to plug the nerfacc library into the *official* codebase
of `TensoRF <https://apchenstu.github.io/TensoRF/>`_. See
`our forked repo <https://github.com/liruilong940607/tensorf/tree/f2d350873c54f249e64b6e745919b6a94bf54f1d>`_
for details.
Benchmark: NeRF-Synthetic Dataset
---------------------------------
*updated on 2023-04-04 with nerfacc==0.5.0*
Our experiments are conducted on a single NVIDIA GeForce RTX 2080 Ti.

+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| PSNR                  | Lego  | Mic   |Materials| Chair |Hotdog | Ficus | Drums | Ship  | MEAN  |
|                       |       |       |         |       |       |       |       |       |       |
+=======================+=======+=======+=========+=======+=======+=======+=======+=======+=======+
| TensoRF               | 35.14 | 25.70 | 33.69   | 37.03 | 36.04 | 29.77 | 34.35 | 30.12 | 32.73 |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| training time         | 504s  | 522s  | 633s    | 648s  | 584s  | 824s  | 464s  | 759s  | 617s  |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| Ours (occ)            | 35.05 | 25.70 | 33.54   | 36.99 | 35.62 | 29.76 | 34.08 | 29.39 | 32.52 |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+
| training time         | 310s  | 312s  | 463s    | 433s  | 363s  | 750s  | 303s  | 468s  | 425s  |
+-----------------------+-------+-------+---------+-------+-------+-------+-------+-------+-------+

Benchmark: Tanks&Temples Dataset
---------------------------------
*updated on 2023-04-04 with nerfacc==0.5.0*
Our experiments are conducted on a single NVIDIA GeForce RTX 2080 Ti.

+-----------------------+-------+-------------+--------+-------+-------+
| PSNR                  | Barn  | Caterpillar | Family | Truck | MEAN  |
|                       |       |             |        |       |       |
+=======================+=======+=============+========+=======+=======+
| TensoRF               | 26.88 | 25.48       | 33.48  | 26.59 | 28.11 |
+-----------------------+-------+-------------+--------+-------+-------+
| training time         | 24min | 19min       | 15min  | 18min | 19min |
+-----------------------+-------+-------------+--------+-------+-------+
| Ours (occ)            | 26.74 | 25.64       | 33.16  | 26.70 | 28.06 |
+-----------------------+-------+-------------+--------+-------+-------+
| training time         | 19min | 15min       | 11min  | 13min | 14min |
+-----------------------+-------+-------------+--------+-------+-------+

......@@ -149,9 +149,9 @@ Links:
:caption: Projects
nerfstudio <https://docs.nerf.studio/>
sdfstudio <https://autonomousvision.github.io/sdfstudio/>
instant-nsr-pl <https://github.com/bennyguo/instant-nsr-pl>
.. _`vanilla Nerf`: https://arxiv.org/abs/2003.08934
.. _`Instant-NGP Nerf`: https://arxiv.org/abs/2201.05989
.. _`D-Nerf`: https://arxiv.org/abs/2011.13961
......
......@@ -91,3 +91,17 @@ In our library, :func:`nerfacc.traverse_grids` is a function that requires synch
because it needs to know the size of the output tensor when traversing the grids. As a result,
sampling with :class:`nerfacc.OccGridEstimator` also requires synchronization. There is
no workaround in this case, so just be aware of it.
Profiling
-----------------------------
There are plenty of tools for profiling. My personal favorite is
`line_profiler <https://github.com/pyutils/line_profiler>`_, which gives you the *per-line* runtime
of a function with a simple decorator, ``@profile``. It is very useful for finding where the bottleneck
is in your code. Note that because PyTorch launches CUDA work asynchronously, you
need to set ``CUDA_LAUNCH_BLOCKING=1`` when profiling your code (no matter which profiling tool you are using).
This variable forces CPU-GPU synchronization for every torch function (equivalent to adding
``torch.cuda.synchronize()`` everywhere), which reveals the true runtime of each line of code.
Of course, with ``CUDA_LAUNCH_BLOCKING=1`` the total runtime is slower, so don't forget to
remove it when you are done profiling.
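
One way to avoid leaving the variable set by accident is to pass it only to the profiling subprocess. The sketch below is a minimal illustration, not part of nerfacc: the helper names ``profiling_env``, ``kernprof_command`` and ``run_profiled`` are hypothetical, and it assumes ``line_profiler`` is installed so that its ``kernprof`` command is on the ``PATH``.

```python
import os
import subprocess
from typing import Dict, List, Optional


def profiling_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base is None else base)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def kernprof_command(script: str) -> List[str]:
    """Build the kernprof invocation for a script.

    -l profiles line by line the functions decorated with @profile,
    -v prints the collected timings when the script exits.
    """
    return ["kernprof", "-l", "-v", script]


def run_profiled(script: str) -> int:
    """Run the script under kernprof with blocking CUDA launches."""
    completed = subprocess.run(kernprof_command(script), env=profiling_env())
    return completed.returncode


# Example call ("train.py" is a placeholder script name):
# run_profiled("train.py")
```

Because the variable is set only in the child process environment, the parent shell and any later non-profiling runs are unaffected.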