change / sglang
Commit 32d9e39a
Authored Aug 06, 2025 by kk; committed by GitHub Aug 05, 2025

Fix potential memory fault issue and ncclSystemError in CI test (#8681)

Co-authored-by: wunhuang <wunhuang@amd.com>

Parent: 4f4e0e41

Showing 3 changed files with 7 additions and 9 deletions

.github/workflows/pr-test-amd.yml                      +1 -1
python/sglang/srt/layers/attention/aiter_backend.py    +5 -8
scripts/amd_ci_start_container.sh                      +1 -0

.github/workflows/pr-test-amd.yml
@@ -291,7 +291,7 @@ jobs:
           bash scripts/amd_ci_exec.sh python3 run_suite.py --suite per-commit-8-gpu-amd --timeout-per-file 3600
 
       - name: Run CustomAllReduce test
-        timeout-minutes: 10
+        timeout-minutes: 20
         run: |
           bash scripts/amd_ci_exec.sh -e CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 -m unittest test_custom_allreduce.TestCustomAllReduce

python/sglang/srt/layers/attention/aiter_backend.py
@@ -720,11 +720,6 @@ class AiterIndicesUpdaterPrefill:
         self.req_to_token = model_runner.req_to_token_pool.req_to_token
         self.update = self.update_single_wrapper
 
-        # get the last index of the pool
-        self.pool_size = (
-            model_runner.token_to_kv_pool.size + model_runner.token_to_kv_pool.page_size
-        ) - 1
         self.kv_indices = None
         self.max_q_len = 0
         self.max_kv_len = 0
@@ -769,9 +764,8 @@ class AiterIndicesUpdaterPrefill:
             # but the 0 location will be made nan (noqa) in cuda graph capture mode
             # this will cause the output tensor value becomes nan
             # WA is to assure that last index of pool not changed
-            kv_indices = torch.full(
-                (paged_kernel_lens_sum + 128,),
-                self.pool_size,
+            kv_indices = torch.empty(
+                paged_kernel_lens_sum + 256,
                 dtype=torch.int32,
                 device=req_pool_indices.device,
             )
@@ -785,6 +779,9 @@ class AiterIndicesUpdaterPrefill:
                 self.req_to_token.shape[1],
             )
 
+            token_num = kv_indptr[-1]
+            kv_indices[token_num:] = kv_indices[0]
+
             self.max_kv_len = torch.max(paged_kernel_lens).item()
 
             extend_lens = seq_lens - prefix_lens

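The aiter_backend.py hunks replace a sentinel-filled index buffer (`torch.full(..., self.pool_size)`) with an uninitialized one (`torch.empty`) whose tail is then overwritten with the first real index (`kv_indices[token_num:] = kv_indices[0]`). A plausible reading is that any read past the real entries then lands on a slot that is known to be valid. The sketch below models that padding step with plain Python lists instead of tensors; `build_padded_kv_indices`, its arguments, and the `-1` garbage marker are hypothetical names chosen for illustration and are not part of sglang.

```python
def build_padded_kv_indices(per_request_indices, pad=256):
    """Flatten per-request KV slot indices and pad the tail with a valid slot.

    Hypothetical helper: a list-based model of the padding step in the diff,
    not the sglang implementation.
    """
    flat = [idx for req in per_request_indices for idx in req]
    token_num = len(flat)
    if token_num == 0:
        raise ValueError("need at least one real index to use as padding")

    # Stand-in for torch.empty: the tail starts out as garbage (-1 here).
    kv_indices = flat + [-1] * pad

    # Counterpart of `kv_indices[token_num:] = kv_indices[0]` in the diff:
    # every padded entry now points at a slot that is known to be in range.
    kv_indices[token_num:] = [kv_indices[0]] * pad
    return kv_indices, token_num


indices, token_num = build_padded_kv_indices([[7, 8, 9], [20, 21]], pad=4)
print(indices)    # [7, 8, 9, 20, 21, 7, 7, 7, 7]
print(token_num)  # 5
```

Padding with an index taken from the batch itself means the buffer no longer depends on a separately computed pool bound, which is consistent with the removal of `self.pool_size` in the first hunk.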
scripts/amd_ci_start_container.sh
@@ -124,6 +124,7 @@ echo "Starting container: ci_sglang"
 docker run -dt --user root --device=/dev/kfd $DEVICE_FLAG \
   -v "${GITHUB_WORKSPACE:-$PWD}:/sglang-checkout" \
   --ipc=host --group-add video \
+  --shm-size 32g \
   --cap-add=SYS_PTRACE \
   -e HF_TOKEN="${HF_TOKEN:-}" \
   --security-opt seccomp=unconfined \