Commit 16a8cb24 authored by yan.yan

update doc

parent 596a3cc0
@@ -2,7 +2,7 @@ name: 'Close stale issues and PRs'
 on:
   schedule:
-    - cron: '30 1 1 * *'
+    - cron: '30 1 1 1 *'
   workflow_dispatch:
     inputs:
       logLevel:
......
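The schedule change in the hunk above sets the fourth cron field (month) to `1` instead of `*`. The workflow therefore stops firing at 01:30 UTC on the first day of every month and fires only at 01:30 UTC on 1 January. The Python sketch below illustrates that difference. `field_matches` and `cron_matches` are hypothetical helper names, not part of GitHub Actions, and the sketch assumes fields that are either `*` or a plain integer (no ranges, lists, or steps).

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field that is either '*' or a plain integer."""
    return field == "*" or int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if a five-field cron expression fires at `when`.

    Field order: minute, hour, day-of-month, month, day-of-week.
    """
    minute, hour, dom, month, dow = expr.split()
    if not (field_matches(minute, when.minute)
            and field_matches(hour, when.hour)
            and field_matches(month, when.month)):
        return False
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_dow = (when.weekday() + 1) % 7
    dom_ok = field_matches(dom, when.day)
    dow_ok = field_matches(dow, cron_dow)
    # Standard cron rule: if both day fields are restricted, either may match.
    if dom != "*" and dow != "*":
        return dom_ok or dow_ok
    return dom_ok and dow_ok


march_first = datetime(2022, 3, 1, 1, 30)
print(cron_matches("30 1 1 * *", march_first))  # old schedule
print(cron_matches("30 1 1 1 *", march_first))  # new schedule
```

With the old expression the March timestamp matches; with the new one it does not, because the month field now requires January.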
@@ -60,6 +60,8 @@ Check [spconv 2.x algorithm introduction](docs/spconv2_algo.pdf) to understand s
 Use spconv >= cu114 if possible. cuda 11.4 can compile greatly faster kernel in some situation.
+Update Spconv: you **MUST UNINSTALL** all spconv/cumm/spconv-cuxxx/cumm-cuxxx first, use ```pip list | grep spconv``` and ```pip list | grep cumm``` to check all installed package. then use pip to install new spconv.
 ## NEWS
 * spconv 2.2: ampere feature support (by [EvernightAurora](https://github.com/EvernightAurora)), pure c++ code generation, nvrtc, drop python 3.6
@@ -67,7 +69,7 @@ Use spconv >= cu114 if possible. cuda 11.4 can compile greatly faster kernel in
 ## Spconv 2.2 vs Spconv 2.1
 * faster fp16 conv kernels (~5-30%) in ampere GPUs (tested in RTX 3090)
-* greatly faster int8 conv kernels (~1.2x~2.7x) in ampere GPUs (tested in RTX 3090)
+* greatly faster int8 conv kernels (~1.2x-2.7x) in ampere GPUs (tested in RTX 3090)
 * drop python 3.6 support
 * nvrtc support: kernel in old GPUs will be compiled in runtime.
 * [libspconv](docs/PURE_CPP_BUILD.md): pure c++ build of all spconv ops. see [example](example/libspconv/run_build.sh)
......
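The README line added in this commit asks users to check for every installed spconv/cumm package before upgrading. As a rough equivalent of `pip list | grep spconv` and `pip list | grep cumm`, the sketch below lists matching distributions with the standard-library `importlib.metadata` module (Python 3.8+). The helpers `find_spconv_packages` and `uninstall_command` are hypothetical names for illustration only and are not part of spconv. The sketch only prints a suggested `pip uninstall` command and does not run it.

```python
from importlib import metadata

# Substrings to look for, mirroring the two grep commands in the README.
KEYWORDS = ("spconv", "cumm")


def find_spconv_packages(names=None):
    """Return the sorted names of installed packages containing a keyword.

    `names` can be passed explicitly for testing; by default the names of
    all distributions visible to the current interpreter are used.
    """
    if names is None:
        names = [dist.metadata["Name"] for dist in metadata.distributions()]
    return sorted(
        {n for n in names if n and any(k in n.lower() for k in KEYWORDS)}
    )


def uninstall_command(packages):
    """Build a pip command that would remove the given packages."""
    if not packages:
        return ""
    return "pip uninstall -y " + " ".join(packages)


found = find_spconv_packages()
print(found or "no spconv/cumm packages installed")
print(uninstall_command(found))
```

Once the printed list is empty after uninstalling, the new spconv package can be installed with pip as the README describes.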
@@ -18,25 +18,29 @@
 ### Network Benchmark without batchnorm (TF32/F16) in Different GPUs
-Basic: ```python -m spconv.benchmark bench_basic f16``` and ```python -m spconv.benchmark bench_basic tf32```
+Basic (120k voxels): ```python -m spconv.benchmark bench_basic f16``` and ```python -m spconv.benchmark bench_basic tf32```
 | GPUs | F16-Forward | F16-Backward | TF32-Forward | TF32-Backward |
 | -------------- |:---------------------:|---------------------:|---------------------:| ---------------------:|
 | T4 | 18.74 | 25.51 | N/A | N/A |
 | RTX 3080 Laptop (150W) | 8.2 | 11.51 | 15.04 | 26.90 |
 | A100 | 13.02 | 12.43 | 12.35 | 14.93 |
-| RTX3090 | 11.84 | 11.84 | 13.23 | 15.79 |
+| RTX 3090 | 11.84 | 11.84 | 13.23 | 15.79 |
 | RTX A6000 | 11.11 | 8.97 | 12.30 | 12.79 |
+| TESLA V100-32G | 15.55 | 14.90 | N/A | N/A |
+| TESLA V100-16G | 10.61 | 13.91 | N/A | N/A |
-Large: ```python -m spconv.benchmark bench_large f16``` and ```python -m spconv.benchmark bench_large tf32```
+Large (900k voxels): ```python -m spconv.benchmark bench_large f16``` and ```python -m spconv.benchmark bench_large tf32```
 | GPUs | F16-Forward | F16-Backward | TF32-Forward | TF32-Backward |
 | -------------- |:---------------------:|---------------------:|---------------------:| ---------------------:|
 | T4 | 128.7 | 203.3 | N/A | N/A |
 | RTX 3080 Laptop (150W) | 43.15 | 74.57 | 84.65 | 165.19 |
 | A100 | 19.85 | 31.24 | 29.58 | 55.63 |
-| RTX3090 | 27.83 | 40.45 | 44.51 | 73.17 |
+| RTX 3090 | 27.83 | 40.45 | 44.51 | 73.17 |
 | RTX A6000 | 28.62 | 39.86 | 45.43 | 74.11 |
+| TESLA V100-32G | 50.37 | 72.99 | N/A | N/A |
+| TESLA V100-16G | 38.65 | 61.47 | N/A | N/A |
 **NOTE**
......
 ## libspconv Example
 run ```run_build.sh``` to get ```libspconv.so```.
\ No newline at end of file
+## libspconv API
+currently not available, but you can check python code to understand how to use C++ apis, spconv python and libspconv use same c++ code.
\ No newline at end of file