tsoc / superbenchmark
Commit b8b080e2
Authored Apr 02, 2026 by one
Update docs
Parent: 04564997

Changes: 3 changed files with 15 additions and 10 deletions
docs/user-tutorial/benchmarks/micro-benchmarks.md (+8 -6)
docs/user-tutorial/data-diagnosis.md (+3 -2)
docs/user-tutorial/result-summary.md (+4 -2)

docs/user-tutorial/benchmarks/micro-benchmarks.md @ b8b080e2
@@ -10,15 +10,17 @@ id: micro-benchmarks
 #### Introduction
-Measure GPU kernel launch latency, which is defined as the time range from the beginning of the launch API call to the beginning of the kernel execution.
+Measure GPU kernel launch performance from multiple perspectives, including end-to-end latency, host-side dispatch overhead, steady-state launch throughput, and device-side launch time.
 #### Metrics
 | Name | Unit | Description |
-|--------------------------|-----------|--------------------------------------|
-| kernel-launch/event_time | time (ms) | Launch latency measured in GPU time. |
-| kernel-launch/wall_time | time (ms) | Launch latency measured in CPU time. |
+|-------------------------------------|--------------------|------------------------------------------------------------------|
+| kernel-launch/e2e_latency_us | time (us) | Single-shot end-to-end latency measured in CPU time. |
+| kernel-launch/host_dispatch_us | time (us) | Host-side dispatch overhead per kernel measured in CPU time. |
+| kernel-launch/launch_throughput_mkps| throughput (MKPS) | Steady-state kernel launch throughput. |
+| kernel-launch/device_launch_us | time (us) | Device-side average launch time per kernel measured by events. |
 ### `gemm-flops`
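
Note on units: the new `kernel-launch` metrics are reported in microseconds where the old ones were in milliseconds, and a throughput figure is added. The following is a minimal Python sketch of the unit arithmetic only. It assumes MKPS stands for million kernels per second, which the diff does not spell out. The helper names are hypothetical, and nothing here implies that an old metric and a new metric measure the same interval.

```python
def ms_to_us(value_ms: float) -> float:
    """Convert a time in milliseconds to microseconds."""
    return value_ms * 1000.0


def mkps_to_us_per_kernel(throughput_mkps: float) -> float:
    """Average time per launch in microseconds.

    Assumes MKPS means 1e6 kernels per second, so 1 MKPS is 1 us per kernel.
    """
    if throughput_mkps <= 0:
        raise ValueError("throughput must be positive")
    return 1.0 / throughput_mkps
```

Under that assumption, a steady-state throughput of 0.5 MKPS corresponds to an average of 2 us per launch.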

docs/user-tutorial/data-diagnosis.md @ b8b080e2
@@ -83,8 +83,9 @@ superbench:
       criteria: lambda x:x>0.05
       categories: KernelLaunch
       metrics:
-        - kernel-launch/event_time:\d+
-        - kernel-launch/wall_time:\d+
+        - kernel-launch/e2e_latency_us:\d+
+        - kernel-launch/host_dispatch_us:\d+
+        - kernel-launch/device_launch_us:\d+
     rule1:
       # Rule 1: If H2D_Mem_BW or D2H_Mem_BW test suffers > 5% downgrade, label it as defective
       function: variance
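
To illustrate what the updated rule selects, here is a minimal Python sketch, not SuperBench's own implementation. It assumes the trailing `:\d+` matches a per-GPU rank suffix, that the variance is the relative change `(value - baseline) / baseline`, and that baselines are keyed by the metric name without the suffix. The function name `violations` and the sample layout are hypothetical.

```python
import re

# Patterns copied from the updated rule in the diff.
PATTERNS = [
    r"kernel-launch/e2e_latency_us:\d+",
    r"kernel-launch/host_dispatch_us:\d+",
    r"kernel-launch/device_launch_us:\d+",
]

# Criteria from the rule: flag a relative increase of more than 5%.
criteria = lambda x: x > 0.05


def violations(results: dict, baseline: dict) -> list:
    """Return metric names whose relative change vs. baseline trips the criteria."""
    flagged = []
    for name, value in results.items():
        # Skip metrics the rule does not list.
        if not any(re.fullmatch(p, name) for p in PATTERNS):
            continue
        # Assumed layout: the baseline is keyed without the rank suffix.
        base = baseline[re.sub(r":\d+$", "", name)]
        variance = (value - base) / base
        if criteria(variance):
            flagged.append(name)
    return flagged
```

The rule leaves out `kernel-launch/launch_throughput_mkps`, and the diff does not say why. Under the formula assumed here, a higher-is-better metric would need a criterion with the opposite sign.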

docs/user-tutorial/result-summary.md @ b8b080e2
@@ -70,8 +70,10 @@ superbench:
       aggregate: True
       categories: KernelLaunch
       metrics:
-        - kernel-launch/event_time
-        - kernel-launch/wall_time
+        - kernel-launch/e2e_latency_us
+        - kernel-launch/host_dispatch_us
+        - kernel-launch/launch_throughput_mkps
+        - kernel-launch/device_launch_us
     nccl:
       statistics: mean
       categories: NCCL
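
As a rough picture of what this summary rule asks for, the sketch below groups the four listed metrics across ranks and nodes and reports one value per metric. It is an illustration under assumptions, not SuperBench's code: it treats `aggregate: True` as merging names that differ only by a `:<rank>` suffix, it uses the mean as the statistic (the statistics for the KernelLaunch rule fall outside this hunk), and the `summarize` function and the input layout are hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean

# Metric names copied from the updated rule in the diff.
METRICS = {
    "kernel-launch/e2e_latency_us",
    "kernel-launch/host_dispatch_us",
    "kernel-launch/launch_throughput_mkps",
    "kernel-launch/device_launch_us",
}


def summarize(rows: list) -> dict:
    """Average each listed metric over all nodes and ranks.

    rows: one dict per node, mapping 'metric:rank' to a numeric value.
    """
    grouped = defaultdict(list)
    for row in rows:
        for name, value in row.items():
            base = re.sub(r":\d+$", "", name)  # drop the assumed rank suffix
            if base in METRICS:
                grouped[base].append(value)
    return {name: mean(values) for name, values in grouped.items()}
```

Metrics that the rule does not list are ignored, so unrelated benchmark results in the same rows do not affect the summary.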