OpenDAS / ktransformers
Commit e8e83308
Authored May 14, 2025 by qiyuxinlin
fix flashinfer float_workspace_buffer small
Parent: 02948bc1
Showing 1 changed file with 2 additions and 2 deletions
ktransformers/server/backend/interfaces/balance_serve.py
@@ -195,13 +195,13 @@ class Engine:
         self.block_num = inference_context.k_cache[0].size(1)
+        self.model_runner = ModelRunner(self.model, self.device, self.args.use_cuda_graph, page_size = args.page_size, block_num=self.block_num)
         #@TODO add config
         if config.architectures[0] == "Qwen2MoeForCausalLM" or config.architectures[0] == "Qwen3MoeForCausalLM":
-            self.model.init_wrapper(self.args.use_cuda_graph, self.device, Config().chunk_size, args.max_batch_size, self.block_num) # TODO: 1024 is a magic number(max_batch_tokens)
+            self.model.init_wrapper(self.args.use_cuda_graph, self.device, max(self.model_runner.cuda_graphs), args.max_batch_size, self.block_num)
         else:
             self.model.init_wrapper(self.args.use_cuda_graph, self.device, args.max_batch_size, self.block_num)
-        self.model_runner = ModelRunner(self.model, self.device, self.args.use_cuda_graph, page_size = args.page_size, block_num=self.block_num)
         self.sampler = Sampler()
         self.query_manager = QueryManager(device = self.device, page_size = args.page_size)
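
The reordering in this hunk matters because `max(self.model_runner.cuda_graphs)` can only be evaluated once `self.model_runner` exists. The following is a minimal sketch of that dependency, not the ktransformers code: `FakeModelRunner`, `FakeModel`, `SketchEngine`, the `runner_first` flag, and the example graph sizes are all hypothetical stand-ins, and `cuda_graphs` is assumed to be a list of integer token counts.

```python
# Hypothetical stand-ins for illustration only; the real ModelRunner and
# model classes are defined in ktransformers and do far more than this.
class FakeModelRunner:
    def __init__(self, cuda_graphs):
        # Assumption: a list of token counts for which CUDA graphs are captured.
        self.cuda_graphs = list(cuda_graphs)


class FakeModel:
    def __init__(self):
        self.max_batch_token = None

    def init_wrapper(self, use_cuda_graph, device, max_batch_token,
                     max_batch_size, block_num):
        # Only records the sizing argument so the effect can be inspected.
        self.max_batch_token = max_batch_token


class SketchEngine:
    def __init__(self, cuda_graphs, max_batch_size=4, block_num=16,
                 runner_first=True):
        self.model = FakeModel()
        if runner_first:
            # Order after the commit: build the runner before init_wrapper.
            self.model_runner = FakeModelRunner(cuda_graphs)
        # Reads self.model_runner, so the runner must already exist here.
        self.model.init_wrapper(True, "cuda",
                                max(self.model_runner.cuda_graphs),
                                max_batch_size, block_num)
        if not runner_first:
            # Order before the commit: the runner was built only at this point.
            self.model_runner = FakeModelRunner(cuda_graphs)


engine = SketchEngine([1, 2, 4, 64, 256])
print(engine.model.max_batch_token)
```

With `runner_first=False` the sketch raises `AttributeError`, because `self.model_runner` is read before it is assigned. This is why the `ModelRunner` construction moves above the `init_wrapper` call in the same commit.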