OpenDAS / Lmdeploy
Commit fa7cbc7a (Unverified)
Authored Jul 04, 2023 by tpoisonooo; committed by GitHub on Jul 04, 2023
Update quantization.md (#47)
Parent: 197b3ee1
Showing 1 changed file with 2 additions and 2 deletions
docs/zh_cn/quantization.md (+2 -2)
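@@ -4,7 +4,7 @@
 Test procedure:
 1. Run `deploy.py` to split the 100B model across 8 GPUs
 2. Run the quantization script to obtain the quantization parameters, and place them in the weights directory
-3. Edit the configuration file to enable the [kCacheKVInt8](../src/turbomind/models/llama/llama_utils.h) option
+3. Edit the configuration file to enable the [kCacheKVInt8](../../src/turbomind/models/llama/llama_utils.h) option
 4. Run the test datasets and compare accuracy and GPU memory usage against the fp16 version

 ## GPU memory reduction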
@@ -58,4 +58,4 @@
 | QA | openbookqa_fact | v1-4e92f0 | accuracy | -14.00 |
 | QA | nq | v1-d2370e | score | -2.16 |
 | QA | triviaqa | v1-ead882 | score | -0.43 |
-| Security | crows_pairs | v1-8fe12f | accuracy | 11.08 |
\ No newline at end of file
+| Security | crows_pairs | v1-8fe12f | accuracy | 11.08 |
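
Review note: the first hunk changes only the relative link to `llama_utils.h`. A relative Markdown link is resolved against the directory of the file that contains it, here `docs/zh_cn/`. The sketch below illustrates why one `../` was not enough. It assumes the header lives at `src/turbomind/models/llama/llama_utils.h` under the repository root, which the fix implies but the diff itself does not show; `resolve_link` is a hypothetical helper written for this illustration, not part of the project.

```python
import posixpath


def resolve_link(doc_path: str, link: str) -> str:
    """Resolve a relative Markdown link against the directory of its document.

    Hypothetical helper for illustration; paths are repository-relative.
    """
    doc_dir = posixpath.dirname(doc_path)
    return posixpath.normpath(posixpath.join(doc_dir, link))


DOC = "docs/zh_cn/quantization.md"
# Assumed location of the header, relative to the repository root.
TARGET = "src/turbomind/models/llama/llama_utils.h"

old = resolve_link(DOC, "../src/turbomind/models/llama/llama_utils.h")
new = resolve_link(DOC, "../../src/turbomind/models/llama/llama_utils.h")

print(old)  # docs/src/turbomind/models/llama/llama_utils.h
print(new)  # src/turbomind/models/llama/llama_utils.h
print(new == TARGET)  # True
```

Under that assumption the old link points into `docs/src/...`, while the new one climbs out of both `zh_cn/` and `docs/` and reaches the source tree. The second hunk changes no table content; it only adds the missing newline at the end of the file.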