ModelZoo / mamba2_pytorch
Commit 000d7bab, authored Sep 21, 2024 by luopl
Update README.md
Parent: 185e8d8c
Showing 1 changed file with 24 additions and 41 deletions
README.md (+24, -41)
@@ -104,28 +104,21 @@ pip install e .
 Running inference automatically connects to Hugging Face and downloads the latest dataset into the cache directory. If Hugging Face is unreachable, set a mirror endpoint with export HF_ENDPOINT=https://hf-mirror.com
 ```
 # Dataset format
-hellaswag/
-└── default
-    └── 0.1.0
-        ├── 362ac471216900f3f7c021863caac4eb7886347d0f76d90b6b4361f59ffea4d7
+openbookqa/
+└── main
+    └── 0.0.0
+        ├── 388097ea7776314e93a529163e0fea805b8a6454
+        │   ├── cache-081f361bf081c0bf.arrow
+        │   ├── cache-5d43362a7601c065.arrow
         │   ├── dataset_info.json
-        │   ├── hellaswag-test.arrow
-        │   ├── hellaswag-train.arrow
-        │   └── hellaswag-validation.arrow
-        ├── 362ac471216900f3f7c021863caac4eb7886347d0f76d90b6b4361f59ffea4d7_builder.lock
-        └── 362ac471216900f3f7c021863caac4eb7886347d0f76d90b6b4361f59ffea4d7.incomplete_info.lock
+        │   ├── openbookqa-test.arrow
+        │   ├── openbookqa-train.arrow
+        │   └── openbookqa-validation.arrow
+        ├── 388097ea7776314e93a529163e0fea805b8a6454_builder.lock
+        └── 388097ea7776314e93a529163e0fea805b8a6454.incomplete_info.lock
 ...
 ```
-You can also download the data for offline use and place it in the cache directory ~/.cache/huggingface/datasets/ (adjust the path to your own cache location):
-SCNet quick download link for the dataset: [datasets](http://113.200.138.88:18080/aidatasets/mamba2_data_test)
 ## Training
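Before an offline run it can help to confirm that the split files from the tree above are actually present in the cache. The following is a minimal sketch, not part of the repository: the helper name `find_cached_splits` is hypothetical, and it assumes the `<dataset>/<config>/<version>/<hash>/<dataset>-<split>.arrow` layout shown in the tree, with `~/.cache/huggingface/datasets` as the default cache root.

```python
from pathlib import Path


def find_cached_splits(dataset, cache_root="~/.cache/huggingface/datasets"):
    """Return {split_name: path} for Arrow split files found under the cache."""
    root = Path(cache_root).expanduser() / dataset
    splits = {}
    # Assumed layout: <config>/<version>/<hash>/<dataset>-<split>.arrow
    for arrow in sorted(root.glob(f"*/*/*/{dataset}-*.arrow")):
        # "openbookqa-test" -> "test": drop the "<dataset>-" prefix from the stem
        split = arrow.stem[len(dataset) + 1:]
        splits[split] = arrow
    return splits
```

Calling `find_cached_splits("openbookqa")` returns an empty dict when nothing is cached. The `cache-*.arrow` files are ignored because they do not carry the dataset prefix.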
@@ -134,7 +127,7 @@ hellaswag/
 ## Inference
-Running inference automatically connects to Hugging Face to download the model files. You can also use modelscope to download the model files into the cache directory in advance; to use a local copy, set pretrained=/path_to_model/model_name
+Running inference automatically connects to Hugging Face to download the model files. You can also use a [mirror site](https://hf-mirror.com/) to download the model files into the cache directory in advance; to use a local copy, set pretrained=/path_to_model/model_name
 SCNet download link for the model weights: [models](http://113.200.138.88:18080/aimodels/state-spaces)
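The choice between a hub identifier and a local copy comes down to the value given to `pretrained=`. The sketch below illustrates one way to prefer a local directory when it exists. The helper `resolve_pretrained` is hypothetical, and the `<local_root>/<model_name>` layout is an assumption based on the `/path_to_model/model_name` placeholder above.

```python
from pathlib import Path


def resolve_pretrained(model_id, local_root=None):
    """Return a `pretrained=...` string, preferring a local copy when one exists."""
    if local_root is not None:
        # Assumed layout: <local_root>/<model_name>, e.g. /path_to_model/mamba2-2.7b
        candidate = Path(local_root) / model_id.split("/")[-1]
        if candidate.is_dir():
            return f"pretrained={candidate}"
    # Fall back to the hub identifier, which triggers a download on first use
    return f"pretrained={model_id}"
```

The returned string can be passed directly as the `--model_args` value.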
@@ -145,15 +138,15 @@ Evaluate:
 To run evaluations on Mamba-1 models
 ```
-lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba-130m --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
+lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba-130m --tasks lambada_openai,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
 ```
 To run evaluations on Mamba-2 models:
 ```
-lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba2-2.7b --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
-lm_eval --model mamba_ssm --model_args pretrained=state-spaces/transformerpp-2.7b --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
-lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba2attn-2.7b --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
+lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba2-2.7b --tasks lambada_openai,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
+lm_eval --model mamba_ssm --model_args pretrained=state-spaces/transformerpp-2.7b --tasks lambada_openai,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
+lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba2attn-2.7b --tasks lambada_openai,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
 ```
 Inference :
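The evaluation commands above differ only in the `pretrained=` value, so they can be generated from a single template. This is a rough sketch that uses only the flags appearing in the README. The helper `lm_eval_command` and the `TASKS` and `MODELS` constants are names introduced here for illustration.

```python
import shlex

# Task list as it stands after this commit (hellaswag and piqa removed)
TASKS = ["lambada_openai", "arc_easy", "arc_challenge", "winogrande", "openbookqa"]

MODELS = [
    "state-spaces/mamba-130m",
    "state-spaces/mamba2-2.7b",
    "state-spaces/transformerpp-2.7b",
    "state-spaces/mamba2attn-2.7b",
]


def lm_eval_command(pretrained, tasks=TASKS, device="cuda", batch_size=256):
    """Build the argv list for one lm_eval run."""
    return [
        "lm_eval",
        "--model", "mamba_ssm",
        "--model_args", f"pretrained={pretrained}",
        "--tasks", ",".join(tasks),
        "--device", device,
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first
    for model in MODELS:
        print(shlex.join(lm_eval_command(model)))
```

Building an argv list rather than a shell string keeps the comma-separated task list as a single argument. Each list can be passed to `subprocess.run(argv, check=True)` to run it.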
@@ -175,7 +168,7 @@ python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-space
 For multi-GPU inference, use accelerate. Example:
 ```
-HIP_VISIBLE_DEVICES=0,1 accelerate launch -m lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba2-2.7b --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
+HIP_VISIBLE_DEVICES=0,1 accelerate launch -m lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba2-2.7b --tasks lambada_openai,arc_easy,arc_challenge,winogrande,openbookqa --device cuda --batch_size 256
 ```
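The multi-GPU example adds two things to a single-GPU command: the `HIP_VISIBLE_DEVICES` variable and the `accelerate launch -m` prefix. The sketch below does the same from Python. The helper `accelerate_launch` is hypothetical, and it assumes a ROCm environment, as the README example does. On NVIDIA hardware the corresponding variable is `CUDA_VISIBLE_DEVICES`.

```python
import os


def accelerate_launch(command, gpu_ids, base_env=None):
    """Wrap a module-style command for accelerate and restrict the visible GPUs.

    Returns (argv, env) suitable for subprocess.run(argv, env=env).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HIP_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    # `accelerate launch -m <module> ...` mirrors the README example
    return ["accelerate", "launch", "-m", *command], env
```

For example, `accelerate_launch(["lm_eval", "--model", "mamba_ssm"], [0, 1])` yields an argv starting with `accelerate launch -m lm_eval` and an environment with `HIP_VISIBLE_DEVICES=0,1`.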
@@ -196,35 +189,29 @@ state-spaces/mamba2-2.7b result:
 mamba_ssm (pretrained=state-spaces/mamba-130m), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 256
 |    Tasks     |Version|Filter|n-shot|  Metric  | Value |   |Stderr|
 |--------------|------:|------|-----:|----------|------:|---|-----:|
 |winogrande    |      1|none  |     0|acc       | 0.5217|±  |0.0140|
-|piqa          |      1|none  |     0|acc       | 0.6458|±  |0.0112|
-|              |       |none  |     0|acc_norm  | 0.6306|±  |0.0113|
-|openbookqa    |      1|none  |     0|acc       | 0.1700|±  |0.0168|
-|              |       |none  |     0|acc_norm  | 0.2880|±  |0.0203|
+|openbookqa    |      1|none  |     0|acc       | 0.1680|±  |0.0167|
+|              |       |none  |     0|acc_norm  | 0.2860|±  |0.0202|
-|lambada_openai|      1|none  |     0|perplexity|16.0456|±  |0.5091|
-|              |       |none  |     0|acc       | 0.4428|±  |0.0069|
+|lambada_openai|      1|none  |     0|perplexity|16.0435|±  |0.5091|
+|              |       |none  |     0|acc       | 0.4421|±  |0.0069|
-|hellaswag     |      1|none  |     0|acc       | 0.3079|±  |0.0046|
-|              |       |none  |     0|acc_norm  | 0.3522|±  |0.0048|
-|arc_easy      |      1|none  |     0|acc       | 0.4794|±  |0.0103|
-|              |       |none  |     0|acc_norm  | 0.4205|±  |0.0101|
+|arc_easy      |      1|none  |     0|acc       | 0.4785|±  |0.0103|
+|              |       |none  |     0|acc_norm  | 0.4209|±  |0.0101|
 |arc_challenge |      1|none  |     0|acc       | 0.1988|±  |0.0117|
-|              |       |none  |     0|acc_norm  | 0.2457|±  |0.0126|
+|              |       |none  |     0|acc_norm  | 0.2449|±  |0.0126|
 mamba_ssm (pretrained=state-spaces/mamba2-2.7b)
 |    Tasks     |Version|Filter|n-shot|  Metric  |Value |   |Stderr|
 |--------------|------:|------|-----:|----------|-----:|---|-----:|
 |winogrande    |      1|none  |     0|acc       |0.6385|±  |0.0135|
-|piqa          |      1|none  |     0|acc       |0.7628|±  |0.0099|
-|              |       |none  |     0|acc_norm  |0.7617|±  |0.0099|
 |openbookqa    |      1|none  |     0|acc       |0.2940|±  |0.0204|
 |              |       |none  |     0|acc_norm  |0.3880|±  |0.0218|
 |lambada_openai|      1|none  |     0|perplexity|4.0934|±  |0.0888|
 |              |       |none  |     0|acc       |0.6951|±  |0.0064|
-|hellaswag     |      1|none  |     0|acc       |0.4961|±  |0.0050|
-|              |       |none  |     0|acc_norm  |0.6660|±  |0.0047|
 |arc_easy      |      1|none  |     0|acc       |0.6957|±  |0.0094|
 |              |       |none  |     0|acc_norm  |0.6481|±  |0.0098|
 |arc_challenge |      1|none  |     0|acc       |0.3328|±  |0.0138|
@@ -236,14 +223,10 @@ mamba_ssm (pretrained=state-spaces/mamba2attn-2.7b), gen_kwargs: (None), limit:
 |    Tasks     |Version|Filter|n-shot|  Metric  |Value |   |Stderr|
 |--------------|------:|------|-----:|----------|-----:|---|-----:|
 |winogrande    |      1|none  |     0|acc       |0.6519|±  |0.0134|
-|piqa          |      1|none  |     0|acc       |0.7573|±  |0.0100|
-|              |       |none  |     0|acc_norm  |0.7584|±  |0.0100|
 |openbookqa    |      1|none  |     0|acc       |0.3040|±  |0.0206|
 |              |       |none  |     0|acc_norm  |0.3900|±  |0.0218|
 |lambada_openai|      1|none  |     0|perplexity|3.8497|±  |0.0810|
 |              |       |none  |     0|acc       |0.7105|±  |0.0063|
-|hellaswag     |      1|none  |     0|acc       |0.5029|±  |0.0050|
-|              |       |none  |     0|acc_norm  |0.6776|±  |0.0047|
 |arc_easy      |      1|none  |     0|acc       |0.6987|±  |0.0094|
 |              |       |none  |     0|acc_norm  |0.6633|±  |0.0097|
 |arc_challenge |      1|none  |     0|acc       |0.3447|±  |0.0139|
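To compare result tables like the ones above across runs, it can be easier to load them into a dictionary than to read them by eye. The parser below is a minimal sketch: `parse_results` is a name introduced here, and it assumes the eight-column layout shown in the tables, where a blank `Tasks` cell continues the previous task. It expects plain table rows, without diff markers.

```python
def parse_results(table_text):
    """Parse a results table into {task: {metric: (value, stderr)}}."""
    results = {}
    task = None
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header row, the separator row, and anything that is not a data row
        if len(cells) != 8 or cells[0] == "Tasks" or cells[0].startswith("-"):
            continue
        if cells[0]:
            task = cells[0]  # a blank Tasks cell continues the previous task
        metric, value, stderr = cells[4], float(cells[5]), float(cells[7])
        results.setdefault(task, {})[metric] = (value, stderr)
    return results
```

With the mamba2-2.7b table as input, `parse_results(text)["openbookqa"]["acc_norm"]` gives `(0.388, 0.0218)`.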