# nerfacc

This is a **tiny** toolbox for **accelerating** NeRF training & rendering using PyTorch CUDA extensions. Plug-and-play for most NeRFs!

## Examples: Instant-NGP NeRF

``` bash
python examples/trainval.py ngp --train_split trainval
```

Performance on a TITAN RTX:

| trainval | Lego | Mic | Materials | Chair | Hotdog |
| - | - | - | - | - | - |
| Time | 300s  | 272s  | 258s  | 331s  | 287s |
| PSNR | 36.61 | 37.45 | 30.15 | 36.06 | 38.17 |
| FPS  | 11.49 | 21.48 | 8.86  | 15.61 | 7.38 |

For comparison, results from the Instant-NGP paper (5 min of training) on a 3090 (w/ mask):

| trainval | Lego | Mic | Materials | Chair | Hotdog |
| - | - | - | - | - | - |
| PSNR | 36.39 | 36.22 | 29.78 | 35.00 | 37.40 |


## Examples: Vanilla MLP NeRF

``` bash
python examples/trainval.py vanilla --train_split train
```

Performance on the test set:

|  | Lego |
| - | - |
| Paper PSNR (train set) | 32.54 |
| Our PSNR (train set) | 33.21 |
| Our PSNR (trainval set) | 33.66  |
| Our train time & test FPS | 45min; 0.43FPS |

For reference, the vanilla NeRF paper trains on a V100 GPU for 1-2 days per scene, and test-time rendering takes about 30 seconds per 800x800 image. Our model is trained on a TITAN X.

Note: We use a single MLP with more samples (1024), instead of the two MLPs with coarse-to-fine sampling used in the paper. Both approaches share the same spirit: sample densely near the surface. Because our fast rendering inherently skips samples far from the surface, we can simply increase the number of samples for a single MLP and achieve the same goal as coarse-to-fine sampling, without runtime or memory issues.
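
To make the "skip samples far from the surface" idea concrete, here is a minimal PyTorch sketch of occupancy-masked sampling. The grid layout, AABB bounds, and function name are illustrative assumptions, not the actual nerfacc API:

``` python
import torch

def sample_with_occupancy_skip(rays_o, rays_d, occ_grid, near=2.0, far=6.0,
                               n_samples=1024, aabb_min=-1.5, aabb_max=1.5):
    """Hypothetical helper: take dense samples along each ray, then keep only
    those that fall into occupied grid cells, so the MLP never sees empty space."""
    t = torch.linspace(near, far, n_samples, device=rays_o.device)    # (S,)
    pts = rays_o[:, None, :] + rays_d[:, None, :] * t[None, :, None]  # (R, S, 3)

    # Quantize sample positions into the occupancy grid resolution.
    res = occ_grid.shape[0]
    idx = ((pts - aabb_min) / (aabb_max - aabb_min) * res).long().clamp_(0, res - 1)
    keep = occ_grid[idx[..., 0], idx[..., 1], idx[..., 2]]            # (R, S) bool

    # Only the kept samples are evaluated by the radiance MLP; skipping the
    # rest is what makes 1024 samples/ray affordable for a single network.
    return pts[keep], t.expand(pts.shape[:2])[keep]


# Toy usage with a random grid; in practice the grid is updated from densities.
occ_grid = torch.rand(128, 128, 128) > 0.9
rays_o = torch.zeros(4, 3)
rays_d = torch.nn.functional.normalize(torch.rand(4, 3), dim=-1)
pts, ts = sample_with_occupancy_skip(rays_o, rays_d, occ_grid)
```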

<!-- 

Tested with the default settings on the Lego test set.

| Model | Split | PSNR | Train Time | Test Speed | GPU | Train Memory |
| - | - | - | - | - | - | - |
| instant-ngp (paper)            | trainval?            | 36.39  |  -   | -    | 3090    |
| instant-ngp (code)             | train (35k steps)    | 36.08  |  308 sec  | 55.32 fps  | TITAN RTX  |  1734MB |
| instant-ngp (code) w/o rng bkgd| train (35k steps)    | 34.17  |  -  | -  | - |  - |
| torch-ngp (`-O`)               | train (30K steps)    | 34.15  |  310 sec  | 7.8 fps    | V100 |
| ours                           | trainval (35K steps) | 36.22  |  378 sec  | 12.08 fps    | TITAN RTX  | -->

## Tips:

1. Sampling rays over all images per iteration (`batch_over_images=True`) is better: `PSNR 33.31 -> 33.75`.
2. Using a learning-rate scheduler (`MultiStepLR(optimizer, milestones=[20000, 30000], gamma=0.1)`) gives: `PSNR 33.75 -> 34.40`.
3. Increasing the chunk size (`chunk: 8192 -> 81920`) during inference gives a speedup: `FPS 4.x -> 6.2` (tips 2 and 3 are sketched in code after this list).
4. A random background color (`color_bkgd_aug="random"`) for the `Lego` scene actually hurts: `PSNR 35.42 -> 34.38`.
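
Tips 2 and 3 in plain PyTorch, as referenced above. The toy model, loss, and ray tensors are stand-ins so the snippet runs on its own; only the `MultiStepLR` settings and the chunk size come from the tips:

``` python
import torch

# Stand-in radiance field; the real model lives in examples/.
radiance_field = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 4))

# Tip 2: decay the learning rate by 10x at 20k and 30k steps.
optimizer = torch.optim.Adam(radiance_field.parameters(), lr=1e-2)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[20000, 30000], gamma=0.1)

for step in range(35000):
    pts = torch.rand(8192, 3)                    # stand-in for sampled ray points
    loss = radiance_field(pts).square().mean()   # stand-in for the photometric loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()

# Tip 3: render test images in larger chunks (8192 -> 81920); batching more
# rays per forward pass is what gives the FPS 4.x -> 6.2 speedup.
chunk = 81920
rays = torch.rand(800 * 800, 3)                  # stand-in rays for one test image
with torch.no_grad():
    out = torch.cat([radiance_field(rays[i:i + chunk])
                     for i in range(0, rays.shape[0], chunk)], dim=0)
```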