Project: ModelZoo / Qwen_lmdeploy

Commit 77a26812 (unverified)
Authored Oct 13, 2023 by Chen Xin; committed by GitHub on Oct 13, 2023
Parent: 6904053f

Add tp hint for deployment (#555)

* add tp hint for deploy
* fix lint
* assert tp in turbomind
* fix lint
Showing 2 changed files with 4 additions and 1 deletion:

* lmdeploy/serve/turbomind/deploy.py (+3, -1)
* lmdeploy/turbomind/turbomind.py (+1, -0)
lmdeploy/serve/turbomind/deploy.py

@@ -972,7 +972,7 @@ def main(model_name: str,
             META's llama format, and 'hf' means huggingface format
         tokenizer_path (str): the path of tokenizer model
         dst_path (str): the destination path that saves outputs
-        tp (int): the number of GPUs used for tensor parallelism
+        tp (int): the number of GPUs used for tensor parallelism, should be 2^n
         quant_path (str): path of the quantized model, which can be None
         group_size (int): a parameter used in AWQ to quantize fp16 weights
             to 4 bits
@@ -981,6 +981,8 @@ def main(model_name: str,
         f"'{model_name}' is not supported. " \
         f'The supported models are: {MODELS.module_dict.keys()}'
+    assert ((tp & (tp - 1) == 0) and tp != 0), 'tp should be 2^n'
     if model_format is None:
         model_format = 'qwen' if model_name == 'qwen-7b' else 'hf'
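The assertion added here is the standard bit trick for power-of-two tests: for a positive integer tp, tp & (tp - 1) clears the lowest set bit, so the result is zero exactly when tp has a single bit set, and the extra tp != 0 rules out zero. A minimal standalone sketch of the same check (the is_power_of_two helper and the demo loop are ours, not part of the commit):

# Minimal illustration of the power-of-two check used in the new assert.
# 'is_power_of_two' is a hypothetical helper name; the commit inlines the expression.
def is_power_of_two(tp: int) -> bool:
    # tp & (tp - 1) clears the lowest set bit; the result is 0 only when tp
    # has a single set bit, i.e. tp is 1, 2, 4, 8, ...  tp != 0 excludes zero.
    return (tp & (tp - 1) == 0) and tp != 0

for tp in (0, 1, 2, 3, 4, 6, 8, 16):
    print(tp, is_power_of_two(tp))
# -> 0 False, 1 True, 2 True, 3 False, 4 True, 6 False, 8 True, 16 True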
lmdeploy/turbomind/turbomind.py

@@ -86,6 +86,7 @@ class TurboMind:
         node_num = 1
         # read meta from model path
+        assert ((tp & (tp - 1) == 0) and tp != 0), 'tp should be 2^n'
         self.gpu_count = tp
         self.session_len = 2048
         data_type = 'fp16'
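The same assert now appears in both deploy.main() and TurboMind.__init__(). If the check were ever needed in more places, one option would be a small shared validator like the sketch below; this is only an illustration of that refactor, not part of the commit, and the validate_tp name and its location are hypothetical:

# Hypothetical shared helper (not in the commit): both call sites could use it
# instead of repeating the inline assert.
def validate_tp(tp: int) -> int:
    """Raise AssertionError unless tp is a positive power of two."""
    assert ((tp & (tp - 1) == 0) and tp != 0), 'tp should be 2^n'
    return tp

# Usage: self.gpu_count = validate_tp(tp)   # tp=3 -> AssertionError: tp should be 2^n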