tsoc / superbenchmark
Commit ea2c10ab
Authored Feb 20, 2022 by Yifan Xiong; committed by GitHub on Feb 20, 2022
Config - Add T4 configurations for inference (#311)
Add T4 configurations for inference.
Parent: 97ed12f9
Showing 1 changed file with 317 additions and 0 deletions
superbench/config/azure/inference/nc64as_t4_v3.yaml (new file, mode 0 → 100644)
version: v0.4
superbench:
  enable: null
  monitor:
    enable: true
    sample_duration: 1
    sample_interval: 10
  var:
    default_local_mode: &default_local_mode
      enable: true
      modes:
        - name: local
          proc_num: 4
          prefix: CUDA_VISIBLE_DEVICES={proc_rank}
          parallel: yes
    default_pytorch_mode: &default_pytorch_mode
      enable: true
      modes:
        - name: torch.distributed
          proc_num: 4
          node_num: 1
      frameworks:
        - pytorch
    offline_inference_config: &offline_inference_config
      duration: 0
      num_warmup: 64
      num_steps: 2048
      sample_count: 8192
      batch_size: 32
      precision:
        - float32
        - float16
      model_action:
        - inference
      pin_memory: yes
    online_inference_config: &online_inference_config
      duration: 0
      num_warmup: 64
      num_steps: 2048
      sample_count: 8192
      batch_size: 1
      precision:
        - float32
        - float16
      model_action:
        - inference
      pin_memory: yes
  benchmarks:
    kernel-launch:
      <<: *default_local_mode
    gemm-flops:
      <<: *default_local_mode
      parameters:
        precision: [fp32, fp16, fp16_tc]
    nccl-bw:
      enable: true
      modes:
        - name: local
          proc_num: 1
          parallel: no
      parameters:
        ngpus: 4
        maxbytes: 4G
    cpu-memory-bw-latency:
      enable: true
      modes:
        - name: local
          proc_num: 1
          parallel: no
      parameters:
        tests:
          - bandwidth_matrix
          - latency_matrix
          - max_bandwidth
    mem-bw:
      enable: true
      modes:
        - name: local
          proc_num: 4
          prefix: CUDA_VISIBLE_DEVICES={proc_rank} numactl -N {proc_rank}
          parallel: no
    gpu-copy-bw:
      enable: true
      modes:
        - name: local
          parallel: no
      parameters:
        mem_type:
          - htod
          - dtoh
        copy_type:
          - sm
          - dma
    cudnn-function:
      <<: *default_local_mode
    cublas-function:
      <<: *default_local_mode
    matmul:
      <<: *default_local_mode
      frameworks:
        - pytorch
    sharding-matmul:
      <<: *default_pytorch_mode
    computation-communication-overlap:
      <<: *default_pytorch_mode
    ort-inference:fp16-offline:
      <<: *default_local_mode
      parameters:
        pytorch_models:
          - resnet50
          - resnet101
          - resnet152
          - densenet169
          - densenet201
        batch_size: 32
        precision: float16
    ort-inference:fp16-online:
      <<: *default_local_mode
      parameters:
        pytorch_models:
          - resnet50
          - resnet101
          - resnet152
          - densenet169
          - densenet201
        batch_size: 1
        precision: float16
    tensorrt-inference:fp16-offline:
      <<: *default_local_mode
      parameters:
        pytorch_models:
          - resnet50
          - resnet101
          - resnet152
          - densenet169
          - densenet201
          - bert-base
          # - bert-large
        seq_length: 224
        batch_size: 32
        precision: fp16
    tensorrt-inference:fp16-online:
      <<: *default_local_mode
      parameters:
        pytorch_models:
          - resnet50
          - resnet101
          - resnet152
          - densenet169
          - densenet201
          - bert-base
          - bert-large
        seq_length: 224
        batch_size: 1
        precision: fp16
    tensorrt-inference:int8-offline:
      <<: *default_local_mode
      parameters:
        pytorch_models:
          - resnet50
          - resnet101
          - resnet152
          - densenet169
          - densenet201
          - bert-base
          # - bert-large
        seq_length: 224
        batch_size: 32
        precision: int8
    tensorrt-inference:int8-online:
      <<: *default_local_mode
      parameters:
        pytorch_models:
          - resnet50
          - resnet101
          - resnet152
          - densenet169
          - densenet201
          - bert-base
          - bert-large
        seq_length: 224
        batch_size: 1
        precision: int8
    # PyTorch offline inference
    model-benchmarks:gpt-offline:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - gpt2-small
        - gpt2-large
      parameters:
        <<: *offline_inference_config
        batch_size: 8
        seq_len: 224
    model-benchmarks:bert-offline:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - bert-base
        - bert-large
      parameters:
        <<: *offline_inference_config
        seq_len: 224
    model-benchmarks:lstm-offline:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - lstm
      parameters:
        <<: *offline_inference_config
        batch_size: 224
        input_size: 224
        hidden_size: 1000
        seq_len: 32
        pin_memory: no
    model-benchmarks:resnet-offline:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - resnet50
        - resnet101
        - resnet152
      parameters:
        <<: *offline_inference_config
        batch_size: 192
        num_steps: 512
    model-benchmarks:densenet-offline:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - densenet169
        - densenet201
      parameters:
        <<: *offline_inference_config
        pin_memory: no
    model-benchmarks:vgg-offline:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - vgg11
        - vgg13
        - vgg16
        - vgg19
      parameters:
        <<: *offline_inference_config
        pin_memory: no
    # PyTorch online inference
    model-benchmarks:gpt-online:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - gpt2-small
        - gpt2-large
      parameters:
        <<: *online_inference_config
        seq_len: 224
    model-benchmarks:bert-online:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - bert-base
        - bert-large
      parameters:
        <<: *online_inference_config
        seq_len: 224
    model-benchmarks:lstm-online:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - lstm
      parameters:
        <<: *online_inference_config
        input_size: 224
        hidden_size: 1000
        seq_len: 32
        pin_memory: no
    model-benchmarks:resnet-online:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - resnet50
        - resnet101
        - resnet152
      parameters:
        <<: *online_inference_config
    model-benchmarks:densenet-online:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - densenet169
        - densenet201
      parameters:
        <<: *online_inference_config
        pin_memory: no
    model-benchmarks:vgg-online:
      <<: *default_local_mode
      frameworks:
        - pytorch
      models:
        - vgg11
        - vgg13
        - vgg16
        - vgg19
      parameters:
        <<: *online_inference_config
        pin_memory: no
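
Review note: most benchmark entries in this file pull shared defaults through YAML merge keys (`<<: *offline_inference_config` and similar) and then override a few fields. The sketch below is not part of the commit and does not use SuperBench itself. It only emulates the merge-key rule (explicitly written keys win over merged ones, and the merge is shallow) with plain dictionaries, so the effective parameters of two entries can be checked by hand. The helper name `apply_merge_key` is hypothetical, and `pin_memory: yes` / `no` are assumed to load as booleans, as they do under a YAML 1.1 loader.

```python
from copy import deepcopy

# Values copied from the `offline_inference_config` anchor in the file above.
OFFLINE_INFERENCE_CONFIG = {
    "duration": 0,
    "num_warmup": 64,
    "num_steps": 2048,
    "sample_count": 8192,
    "batch_size": 32,
    "precision": ["float32", "float16"],
    "model_action": ["inference"],
    "pin_memory": True,  # assumption: YAML 1.1 `yes` loads as True
}


def apply_merge_key(merged, explicit):
    """Emulate a YAML `<<` merge (hypothetical helper).

    Keys written explicitly in the mapping take precedence over keys
    brought in by the merge; the merge is shallow, not recursive.
    """
    result = deepcopy(merged)
    result.update(deepcopy(explicit))
    return result


# Overrides written under `model-benchmarks:resnet-offline` -> `parameters`.
resnet_offline = apply_merge_key(
    OFFLINE_INFERENCE_CONFIG,
    {"batch_size": 192, "num_steps": 512},
)

# Overrides written under `model-benchmarks:lstm-offline` -> `parameters`.
lstm_offline = apply_merge_key(
    OFFLINE_INFERENCE_CONFIG,
    {
        "batch_size": 224,
        "input_size": 224,
        "hidden_size": 1000,
        "seq_len": 32,
        "pin_memory": False,  # assumption: YAML 1.1 `no` loads as False
    },
)

if __name__ == "__main__":
    for name, params in [("resnet-offline", resnet_offline), ("lstm-offline", lstm_offline)]:
        print(name, params)
```

Under these assumptions, `resnet-offline` keeps the shared warmup count and precision list while running with its own batch size and step count, and `lstm-offline` is the entry that switches `pin_memory` off. The shared defaults dictionary is never mutated, which mirrors how an anchor can be reused by several entries.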