OpenDAS / dynamo
Commit b98188c8 (Unverified)
fix: profiling script missing tests when kv cache is tight (#2567)
Authored by Hongkuan Zhou on Aug 21, 2025; committed by GitHub on Aug 21, 2025
Parent: 19f8eb00
Changes: 1 changed file with 14 additions and 7 deletions
benchmarks/profiler/utils/profile_decode.py (+14 -7)
@@ -42,13 +42,20 @@ def profile_decode(
         (max_context_length - osl) // interpolation_granularity,
     ):
         max_concurrency = max_kv_tokens // (isl + osl)
-        if max_concurrency // interpolation_granularity == 0:
+        if max_concurrency == 0:
+            logger.warning(
+                f"max_kv_tokens {max_kv_tokens} is too small for"
+                f" isl {isl} + osl {osl}, skipping."
+            )
+            break
+        elif max_concurrency < interpolation_granularity:
             logger.warning(
                 f"max_concurrency {max_concurrency} is too small for"
                 f" interpolation granularity {interpolation_granularity}."
                 f" max_kv_tokens {max_kv_tokens}, isl {isl}, osl {osl}"
             )
-            break
+            sweep_num_request = range(1, max_concurrency + 1)
+        else:
             sweep_num_request = range(
                 1,
                 max_concurrency,
...
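To make the effect of the hunk easier to check, here is a minimal standalone sketch of the branch logic it appears to introduce. The helper name `pick_sweep_num_request` is hypothetical and does not exist in `profile_decode.py`; returning `None` stands in for the `break` in the real loop. The step of the final `range` is cut off in the diff view, so `max_concurrency // interpolation_granularity` is an assumption, suggested only by the guard that the commit removes.

```python
import logging

logger = logging.getLogger(__name__)


def pick_sweep_num_request(max_kv_tokens, isl, osl, interpolation_granularity):
    """Hypothetical helper: choose which concurrency levels to profile.

    Returns a range of request counts, or None when not even one request
    of isl + osl tokens fits in the KV cache (the real code breaks out of
    its loop at that point).
    """
    max_concurrency = max_kv_tokens // (isl + osl)

    if max_concurrency == 0:
        # Not a single request fits: nothing to profile for this isl.
        logger.warning(
            "max_kv_tokens %d is too small for isl %d + osl %d, skipping.",
            max_kv_tokens, isl, osl,
        )
        return None

    if max_concurrency < interpolation_granularity:
        # KV cache is tight: sweep every feasible concurrency instead of
        # skipping, which is the case the commit title says was missed.
        logger.warning(
            "max_concurrency %d is too small for interpolation granularity %d.",
            max_concurrency, interpolation_granularity,
        )
        return range(1, max_concurrency + 1)

    # ASSUMPTION: the step is not visible in the truncated diff.
    step = max_concurrency // interpolation_granularity
    return range(1, max_concurrency, step)


if __name__ == "__main__":
    # 1000 // (100 + 150) = 4 concurrent requests, below a granularity of 8.
    print(list(pick_sweep_num_request(1000, 100, 150, 8)))  # [1, 2, 3, 4]
```

Under the removed condition, the `max_concurrency = 4` case above would have hit `4 // 8 == 0` and stopped without profiling; with the new branches it still yields four data points.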