sglang
Commit fbcbb263, authored Oct 23, 2024 by Lianmin Zheng, committed by GitHub on Oct 23, 2024
Fix perf regression for set_kv_buffer (#1765)
Parent: 2fce449b
Showing 1 changed file with 11 additions and 9 deletions
python/sglang/srt/mem_cache/memory_pool.py
@@ -221,17 +221,19 @@ class MHATokenToKVPool(BaseTokenToKVPool):
         cache_v: torch.Tensor,
     ):
         layer_id = layer.layer_id
-        copy_two_array(
-            loc,
-            self.k_buffer[layer_id],
-            cache_k,
-            self.v_buffer[layer_id],
-            cache_v,
-            self.dtype,
-            self.store_dtype,
-        )
+        if cache_k.dtype != self.dtype:
+            cache_k = cache_k.to(self.dtype)
+            cache_v = cache_v.to(self.dtype)
+        if self.store_dtype != self.dtype:
+            self.k_buffer[layer_id][loc] = cache_k.view(self.store_dtype)
+            self.v_buffer[layer_id][loc] = cache_v.view(self.store_dtype)
+        else:
+            self.k_buffer[layer_id][loc] = cache_k
+            self.v_buffer[layer_id][loc] = cache_v
 
 
+# This compiled version is slower in the unit test
+# python3 -m unittest test_bench_serving.TestBenchServing.test_offline_throughput_non_stream_small_batch_size
 @torch.compile(dynamic=True)
 def copy_two_array(loc, dst_1, src_1, dst_2, src_2, dtype, store_dtype):
     dst_1[loc] = src_1.to(dtype).view(store_dtype)
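
The eager write path in this change can be tried in isolation. The following is a minimal sketch, not the sglang implementation: the free function `set_kv_eager`, the buffer shapes, and the choice of `torch.float16` stored as `torch.int16` are assumptions made for illustration. It shows the two steps in the diff: cast the incoming K/V tensors to the compute dtype if needed, then reinterpret their bits as the storage dtype before the indexed write.

```python
import torch


def set_kv_eager(loc, k_buf, v_buf, cache_k, cache_v, dtype, store_dtype):
    """Hypothetical standalone version of the eager write path (illustration only)."""
    # Step 1: convert values to the compute dtype if they arrive in another one.
    if cache_k.dtype != dtype:
        cache_k = cache_k.to(dtype)
        cache_v = cache_v.to(dtype)
    # Step 2: when the buffer uses a different storage dtype, reinterpret the
    # bits with view() (no value conversion), then write at the given slots.
    if store_dtype != dtype:
        k_buf[loc] = cache_k.view(store_dtype)
        v_buf[loc] = cache_v.view(store_dtype)
    else:
        k_buf[loc] = cache_k
        v_buf[loc] = cache_v


# Assumed setup: float16 values kept in an int16 buffer (same element size).
dtype, store_dtype = torch.float16, torch.int16
k_buf = torch.zeros(8, 2, 4, dtype=store_dtype)
v_buf = torch.zeros(8, 2, 4, dtype=store_dtype)
loc = torch.tensor([1, 5])  # token slots to fill
cache_k = torch.randn(2, 2, 4, dtype=torch.float32)
cache_v = torch.randn(2, 2, 4, dtype=torch.float32)

set_kv_eager(loc, k_buf, v_buf, cache_k, cache_v, dtype, store_dtype)

# Reading back: reinterpret the stored bits as the compute dtype again.
restored_k = k_buf[loc].view(dtype)
restored_v = v_buf[loc].view(dtype)
```

Because `view(store_dtype)` only reinterprets bits, reading the slots back with `view(dtype)` returns exactly the values produced by the cast in step 1, and slots not listed in `loc` are left untouched.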