Unverified commit 30d975a5, authored by xiabo123, committed by GitHub

Change the block size to 1024 during compilation. (#2460)

* The default block size on ROCm is 256, but the kernels are designed for a block size of 512. Raise the maximum block size to 1024 at compile time.

parent 46eb9ec5
......
@@ -284,6 +284,9 @@ def get_extensions():
 define_macros += [('MMCV_WITH_CUDA', None)]
 cuda_args = os.getenv('MMCV_CUDA_ARGS')
 extra_compile_args['nvcc'] = [cuda_args] if cuda_args else []
+if is_rocm_pytorch and platform.system() != 'Windows':
+    extra_compile_args['nvcc'] += \
+        ['--gpu-max-threads-per-block=1024']
 op_files = glob.glob('./mmcv/ops/csrc/pytorch/*.cpp') + \
     glob.glob('./mmcv/ops/csrc/pytorch/cpu/*.cpp') + \
     glob.glob('./mmcv/ops/csrc/pytorch/cuda/*.cu') + \
......
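
The conditional added in this hunk can be tried in isolation. The following is a minimal sketch, not code from the repository: `build_nvcc_args` is a hypothetical helper, and its `system` parameter is an assumption added here so the platform check can be exercised without a ROCm machine. It only mirrors the logic shown in the diff, where the flag is appended when PyTorch is a ROCm build and the platform is not Windows.

```python
import platform


def build_nvcc_args(cuda_args, is_rocm_pytorch, system=None):
    """Return extra compiler arguments for the 'nvcc' key.

    Hypothetical helper mirroring the hunk above; not part of mmcv.
    `cuda_args` stands in for os.getenv('MMCV_CUDA_ARGS').
    """
    # Fall back to the real platform name when none is supplied.
    system = system or platform.system()

    # Same starting point as the diff: a user-supplied argument, or nothing.
    args = [cuda_args] if cuda_args else []

    # On ROCm builds outside Windows, raise the per-block thread limit
    # so kernels launched with more than 256 threads can compile.
    if is_rocm_pytorch and system != 'Windows':
        args += ['--gpu-max-threads-per-block=1024']
    return args


if __name__ == '__main__':
    # ROCm build on Linux: the flag is appended after any user argument.
    print(build_nvcc_args('-O3', True, 'Linux'))
    # Non-ROCm build: the argument list is left as the user supplied it.
    print(build_nvcc_args('-O3', False, 'Linux'))
```

Keeping the flag behind both checks means CUDA builds and Windows builds see the same argument list as before the change.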