OpenDAS / ktransformers · Commits

Commit 7adb7281, authored Apr 30, 2025 by Atream

fix-cache-lens

Parent: 8ba7e5d4
Changes: 1 changed file with 6 additions and 1 deletion (+6 -1)

ktransformers/server/args.py (view file @ 7adb7281)
import argparse

from ktransformers.server.backend.args import ConfigArgs, default_args
from ktransformers.util.utils import get_free_ports
from transformers import AutoConfig


class ArgumentParser:
    def __init__(self, cfg):
        ...
...
@@ -138,7 +139,11 @@ class ArgumentParser:
         self.cfg.server_port = args.port
         self.cfg.user_force_think = args.force_think
-        args.gpu_memory_size = 4*1024*1024*1024 # TODO: set this to the actual GPU memory size
+        model_config = AutoConfig.from_pretrained(args.model_dir, trust_remote_code=True)
+        if args.architectures == "Qwen3MoeForCausalLM" or args.architectures == "Qwen2MoeForCausalLM":
+            args.gpu_memory_size = args.cache_lens * 2 * 2 * model_config.num_hidden_layers * model_config.num_key_value_heads * model_config.head_dim
+        else:
+            args.gpu_memory_size = args.cache_lens * 2 * 576 * 61
         self.cfg.gpu_memory_size = args.gpu_memory_size
         free_ports = get_free_ports(3, [args.port])
         args.sched_port = free_ports[0]
...
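The sizing logic this commit introduces can be sketched standalone. The first branch is the standard fp16 KV-cache formula for grouped-query attention: one key and one value tensor per layer (factor 2), two bytes per fp16 element (factor 2), times layers, KV heads, and head dimension. The fallback `cache_lens * 2 * 576 * 61` appears to hard-code a DeepSeek-style MLA cache (a single compressed latent of width 576 per token, 61 layers, 2 bytes per element), though the commit itself does not name the model. The helper names and the example config below (48 layers, 4 KV heads, head_dim 128) are illustrative assumptions, not ktransformers APIs:

```python
# Sketch of the commit's KV-cache sizing, outside ktransformers.
# Helper names and example config values are hypothetical.

def kv_cache_bytes(cache_lens: int,
                   num_hidden_layers: int,
                   num_key_value_heads: int,
                   head_dim: int,
                   bytes_per_elem: int = 2) -> int:
    """Bytes for a standard fp16 KV cache: K and V tensors per layer,
    each holding cache_lens * num_key_value_heads * head_dim elements."""
    return (cache_lens * 2 * bytes_per_elem
            * num_hidden_layers * num_key_value_heads * head_dim)

def mla_cache_bytes(cache_lens: int,
                    latent_dim: int = 576,
                    num_hidden_layers: int = 61,
                    bytes_per_elem: int = 2) -> int:
    """Fallback branch: one compressed latent vector per token per layer
    (matches the commit's cache_lens * 2 * 576 * 61 expression)."""
    return cache_lens * bytes_per_elem * latent_dim * num_hidden_layers

# Hypothetical GQA config: 48 layers, 4 KV heads, head_dim 128,
# 32k cache slots -> exactly 3 GiB of cache.
print(kv_cache_bytes(32768, 48, 4, 128))  # 3221225472
print(mla_cache_bytes(32768))             # 2302672896
```

Deriving the budget from `AutoConfig` keeps the reserved GPU memory proportional to `cache_lens` instead of the previous fixed 4 GiB, which over- or under-reserved depending on the model.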