Skip to content
GitLab
Menu
Projects
Groups
Snippets
Loading...
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
Menu
Open sidebar
ModelZoo
MiniMax-M2_vllm
Commits
9d2097be
Commit
9d2097be
authored
Jan 16, 2026
by
chenych
Browse files
Add minimax-m2.1
parent
9cf41349
Changes
5
Show whitespace changes
Inline
Side-by-side
Showing
5 changed files
with
19 additions
and
4 deletions
+19
-4
README.md
README.md
+18
-3
doc/quant.png
doc/quant.png
+0
-0
doc/result-minimax-m2.png
doc/result-minimax-m2.png
+0
-0
doc/result-minimax-m2_1.png
doc/result-minimax-m2_1.png
+0
-0
model.properties
model.properties
+1
-1
No files found.
README.md
View file @
9d2097be
...
...
@@ -46,6 +46,8 @@ docker run -it --shm-size 60g --network=host --name minimax_m2 --privileged --de
暂无
## 推理
> 以 MiniMax-M2 为例, MiniMax-M2.1模型同理
1.
将FP8模型权重转换成BF16,转换方法如下:
```
bash
...
...
@@ -53,7 +55,7 @@ python cast_model_dtype/fp8_cast_bf16.py --input-fp8-hf-path /path/of/MiniMax/Mi
```
2.
相关模型文件拷贝:
```
bash
cp
config.json /path/of/MiniMax/MiniMax-M2-bf16
cp
/path/of/MiniMax/MiniMax-M2/
config.json /path/of/MiniMax/MiniMax-M2-bf16
cp
/path/of/MiniMax/MiniMax-M2/chat_template.jinja /path/of/MiniMax/MiniMax-M2-bf16
cp
/path/of/MiniMax/MiniMax-M2/configuration.json /path/of/MiniMax/MiniMax-M2-bf16
cp
/path/of/MiniMax/MiniMax-M2/generation_config.json /path/of/MiniMax/MiniMax-M2-bf16
...
...
@@ -62,6 +64,11 @@ cp /path/of/MiniMax/MiniMax-M2/tokenizer* /path/of/MiniMax/MiniMax-M2-bf16
cp
/path/of/MiniMax/MiniMax-M2/vocab.json /path/of/MiniMax/MiniMax-M2-bf16
```
**删掉 `/path/of/MiniMax/MiniMax-M2-bf16/config.json` 中的 `quantization_config` 字段内容,如图所示**
<div
align=
center
>
<img
src=
"./doc/quant.png"
/>
</div>
### vllm
#### 单机推理
```
bash
...
...
@@ -70,7 +77,6 @@ export ALLREDUCE_STREAM_WITH_COMPUTE=1
export
VLLM_MLA_DISABLE
=
0
export
VLLM_USE_FLASH_MLA
=
1
vllm serve /path/of/MiniMax/MiniMax-M2-bf16/
\
--trust-remote-code
\
--max-model-len
32768
\
...
...
@@ -93,10 +99,18 @@ curl http://localhost:8000/v1/chat/completions \
```
## 效果展示
-
MiniMax-M2 模型效果
<div
align=
center
>
<img
src=
"./doc/result
s
.png"
/>
<img
src=
"./doc/result
-minimax-m2
.png"
/>
</div>
-
MiniMax-M2.1 模型效果
<div
align=
center
>
<img
src=
"./doc/result-minimax-m2_1.png"
/>
</div>
### 精度
DCU与GPU精度一致,推理框架:vllm。
...
...
@@ -104,6 +118,7 @@ DCU与GPU精度一致,推理框架:vllm。
| 模型名称 | 权重大小 | DCU型号 | 最低卡数需求 |下载地址|
|:-----:|:----------:|:----------:|:---------------------:|:----------:|
| MiniMax-M2 | 230 B | K100AI | 8 |
[
下载地址
](
https://huggingface.co/MiniMaxAI/MiniMax-M2
)
|
| MiniMax-M2.1 | 230 B | K100AI | 8 |
[
下载地址
](
https://www.modelscope.cn/models/MiniMax/MiniMax-M2.1
)
|
## 源码仓库及问题反馈
-
https://developer.sourcefind.cn/codes/modelzoo/minimax-m2_vllm
...
...
doc/quant.png
0 → 100644
View file @
9d2097be
52 KB
doc/result
s
.png
→
doc/result
-minimax-m2
.png
View file @
9d2097be
File moved
doc/result-minimax-m2_1.png
0 → 100644
View file @
9d2097be
232 KB
model.properties
View file @
9d2097be
...
...
@@ -11,4 +11,4 @@ appCategory=代码生成
# 框架类型
frameType
=
vllm
# 加速卡类型
accelerateType
=
K100AI
\ No newline at end of file
accelerateType
=
K100AI,BW1000
\ No newline at end of file
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment