<!--
 Copyright 2021 Yan Yan
 
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 
     http://www.apache.org/licenses/LICENSE-2.0
 
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

# Spconv 2.x Performance Guide

## Short Guide

* If you train without Tensor Cores (i.e. FP32 training), manually set every ```algo``` argument in convolution/maxpool layers to ```ConvAlgo.Native```. The default algorithm is ```ConvAlgo.MaskImplicitGemm```, which is **SLOWER** than ```ConvAlgo.Native``` when using float32. This will be fixed in spconv 2.2.
* If your GPU supports Tensor Cores, use FP16 (mixed precision training) if possible.
* If you train with mixed precision (i.e. with Tensor Cores), you don't need to set the algorithm manually.
* Currently the fast algorithm only supports a kernel volume (product of the kernel sizes) <= 32, so don't use large kernel sizes.
* Make sure your channel size is a multiple of 8 when using FP16. A multiple of 32 is better.
* spconv 2.x on Windows 10 is 1.5x~2x slower than on Linux. Use Linux if possible.
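
The rules above can be sketched as a few small checks plus a layer factory. This is only an illustrative sketch, not part of spconv: the helper names (`kernel_volume`, `fits_fast_algo`, `round_up_channels`, `build_subm_conv`) are hypothetical, and the factory assumes the usual spconv 2.x imports (`spconv.pytorch` and `spconv.core.ConvAlgo`) and that ```algo``` is accepted as a keyword argument of `SubMConv3d`.

```python
import math

# Kernel-volume limit of the fast algorithm, as quoted in the list above.
FAST_ALGO_MAX_KERNEL_VOLUME = 32


def kernel_volume(kernel_size, ndim=3):
    """Product of the kernel size over all spatial dimensions."""
    if isinstance(kernel_size, int):
        return kernel_size ** ndim
    return math.prod(kernel_size)


def fits_fast_algo(kernel_size, ndim=3):
    """True if the kernel volume is within the fast algorithm's limit."""
    return kernel_volume(kernel_size, ndim) <= FAST_ALGO_MAX_KERNEL_VOLUME


def round_up_channels(channels, multiple=8):
    """Round a channel count up to the next multiple (8 for FP16, 32 is better)."""
    return ((channels + multiple - 1) // multiple) * multiple


def build_subm_conv(in_channels, out_channels, kernel_size=3, use_fp16=False):
    """Hypothetical factory: force ConvAlgo.Native only for FP32 training."""
    # Imported lazily so the pure-Python helpers above work without spconv.
    import spconv.pytorch as spconv
    from spconv.core import ConvAlgo

    # With mixed precision the default algorithm is kept (algo=None).
    algo = None if use_fp16 else ConvAlgo.Native
    return spconv.SubMConv3d(in_channels, out_channels, kernel_size,
                             bias=False, algo=algo)
```

For example, `fits_fast_algo(3)` holds for a 3x3x3 kernel (volume 27), while a 5x5x5 kernel (volume 125) exceeds the limit. For FP32 training, the same ```algo``` choice would also need to be passed to the maxpool layers.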

See [benchmark](BENCHMARK.md) for more performance details of different algorithms.