Unverified commit 27e12477, authored by AllentDan, committed by GitHub

Update the Hugging Face internlm-chat-7b model URL (#546)

parent 0d2a151e
@@ -109,7 +109,7 @@ pip install lmdeploy
 # Make sure you have git-lfs installed (https://git-lfs.com)
 git lfs install
-git clone https://huggingface.co/internlm/internlm-chat-7b /path/to/internlm-chat-7b
+git clone https://huggingface.co/internlm/internlm-chat-7b-v1_1 /path/to/internlm-chat-7b
 # if you want to clone without large files – just their pointers
 # prepend your git clone with the following env var:
...
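For context, the comment in the hunk above is cut off before the variable it refers to. A minimal sketch of a pointer-only clone, assuming the standard `GIT_LFS_SKIP_SMUDGE` variable documented by git-lfs (the destination path is illustrative):

```shell
# Fetch only LFS pointer files instead of the full model weights.
# GIT_LFS_SKIP_SMUDGE=1 tells git-lfs to skip the smudge filter that
# downloads large files during checkout; the clone path is an example.
GIT_LFS_SKIP_SMUDGE=1 git clone \
  https://huggingface.co/internlm/internlm-chat-7b-v1_1 /path/to/internlm-chat-7b
```

The weights can be pulled later with `git lfs pull` from inside the clone.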
@@ -110,7 +110,7 @@ pip install lmdeploy
 # Make sure you have git-lfs installed (https://git-lfs.com)
 git lfs install
-git clone https://huggingface.co/internlm/internlm-chat-7b /path/to/internlm-chat-7b
+git clone https://huggingface.co/internlm/internlm-chat-7b-v1_1 /path/to/internlm-chat-7b
 # if you want to clone without large files – just their pointers
 # prepend your git clone with the following env var:
...
@@ -69,7 +69,7 @@ python3 -m lmdeploy.turbomind.chat ./workspace
 ## GPU Memory Test
-The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) model.
+The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b-v1_1) model.
 Testing method:
 1. Use `deploy.py` to convert the model, modify the maximum concurrency in the `workspace` configuration; adjust the number of requests in `llama_config.ini`.
@@ -93,7 +93,7 @@ As can be seen, the fp16 version requires 1030MB of GPU memory for each concurre
 ## Accuracy Test
-The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) command model.
+The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b-v1_1) command model.
 Below is the result of PTQ quantization of `kCacheKVInt8` method with only 128 randomly selected data from the c4 dataset. The accuracy was tested using [opencompass](https://github.com/InternLM/opencompass) before and after quantization.
...
@@ -69,7 +69,7 @@ python3 -m lmdeploy.turbomind.chat ./workspace
 ## GPU Memory Test
-The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) model.
+The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b-v1_1) model.
 Testing method:
 1. Convert the model with `deploy.py`, modify the maximum concurrency in the `workspace` configuration, and adjust the number of requests in `llama_config.ini`.
@@ -93,7 +93,7 @@ python3 -m lmdeploy.turbomind.chat ./workspace
 ## Accuracy Test
-The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b) instruction model.
+The test object is the [internlm-chat-7b](https://huggingface.co/internlm/internlm-chat-7b-v1_1) instruction model.
 Below is the PTQ quantization of the `kCacheKVInt8` method using only 128 randomly selected samples from the c4 dataset. Accuracy was tested with [opencompass](https://github.com/InternLM/opencompass) before and after quantization.
...