change / sglang · Commit 45d5af24 (Unverified)

Add GLM-4 TextGeneration Model support for SGLang (#1736)

Authored by sixgod on Oct 21, 2024; committed via GitHub on Oct 21, 2024.
Parent commit: b121bc03
Changes: 3 files, +5 −3

- README.md: +1 −0
- python/sglang/srt/models/chatglm.py: +3 −3
- test/srt/models/test_generation_models.py: +1 −0
README.md

```diff
@@ -303,6 +303,7 @@ You can view the full example [here](https://github.com/sgl-project/sglang/tree/
 - MiniCPM / MiniCPM 3
 - XVERSE / XVERSE MoE
 - SmolLM
+- GLM-4

 **Embedding Models**
```
python/sglang/srt/models/chatglm.py

```diff
@@ -303,7 +303,7 @@ class GLMTransformer(nn.Module):
         return hidden_states


-class ChatGLMModel(nn.Module):
+class ChatGLMM(nn.Module):
     def __init__(
         self,
         config,
@@ -366,7 +366,7 @@ class ChatGLMForCausalLM(nn.Module):
         self.config: ChatGLMConfig = config
         self.quant_config = quant_config
         self.max_position_embeddings = getattr(config, "max_sequence_length", 8192)
-        self.transformer = ChatGLMModel(config, cache_config, quant_config)
+        self.transformer = ChatGLMM(config, cache_config, quant_config)
         self.lm_head = self.transformer.output_layer
         self.logits_processor = LogitsProcessor(config)
@@ -401,4 +401,4 @@ class ChatGLMModel(ChatGLMForCausalLM):
     pass


-EntryClass = [ChatGLMForCausalLM, ChatGLMModel]
+EntryClass = [ChatGLMModel]
```
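The chatglm.py change renames the inner module class to `ChatGLMM`, freeing the name `ChatGLMModel` for an alias subclass of `ChatGLMForCausalLM`. Model registries of this style typically match a checkpoint's `architectures` entry in `config.json` against the `__name__` of each exported class, so an alias with the right name makes GLM-4 checkpoints resolve to the existing implementation. A minimal sketch of that registration-by-class-name pattern (the `resolve` helper and the MRO walk are illustrative assumptions, not sglang's actual loader):

```python
class ChatGLMForCausalLM:
    """Stand-in for the real implementation class."""


class ChatGLMModel(ChatGLMForCausalLM):
    # Alias: its __name__ matches the "ChatGLMModel" architecture string
    # that GLM-4-style checkpoints report, so lookup by name succeeds.
    pass


EntryClass = [ChatGLMModel]


def resolve(architecture: str, entries: list) -> type:
    """Return the entry whose class (or any base class) matches the name."""
    for cls in entries:
        # Walking the MRO lets the alias also satisfy requests for its base.
        for klass in cls.__mro__:
            if klass.__name__ == architecture:
                return cls
    raise ValueError(f"unsupported architecture: {architecture}")
```

Under this sketch, both `resolve("ChatGLMModel", EntryClass)` and `resolve("ChatGLMForCausalLM", EntryClass)` return the same alias class, which is why exporting only `[ChatGLMModel]` can remain sufficient.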
test/srt/models/test_generation_models.py

```diff
@@ -57,6 +57,7 @@ ALL_OTHER_MODELS = [
     ModelCase("Qwen/Qwen2.5-14B-Instruct"),
     ModelCase("HuggingFaceTB/SmolLM-135M-Instruct", skip_long_prompt=True),
     ModelCase("allenai/OLMo-1B-0724-hf", decode_tolerance=8e-2, skip_long_prompt=True),
+    ModelCase("THUDM/glm-4-9b-chat"),
 ]

 TORCH_DTYPES = [torch.float16]
```
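The test entries suggest `ModelCase` is a small record holding a model path plus optional per-model overrides such as a decode tolerance and a long-prompt skip flag. A hypothetical sketch (the field name `model_path` and the default values are assumptions; only the keyword names `decode_tolerance` and `skip_long_prompt` appear in the diff):

```python
from dataclasses import dataclass


@dataclass
class ModelCase:
    # Hugging Face model identifier under test.
    model_path: str
    # Allowed deviation when comparing decode logits (default is assumed).
    decode_tolerance: float = 1e-2
    # Some models are skipped on long prompts, e.g. for context-length limits.
    skip_long_prompt: bool = False


# The new entry from this commit relies entirely on the defaults:
case = ModelCase("THUDM/glm-4-9b-chat")
```

Defaults keep the common case terse, which is why the GLM-4 entry needs only the model path.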