docs(README): disable ECC (#159)

* Update README_zh-CN.md
* Update README.md
* Update README_zh-CN.md
* Update README.md
* Update README_zh-CN.md

Commit 63bd5916, authored Jul 26, 2023 by tpoisonooo; committed by GitHub Jul 26, 2023

Showing with 3 additions and 1 deletion

README.md +1 -0

README_zh-CN.md +2 -1

--- a/README.md
+++ b/README.md
@@ -89,6 +89,7 @@ docker run --gpus all --rm -v $(pwd)/workspace:/workspace -it openmmlab/lmdeploy
 ```{note}
 When inferring with FP16 precision, the InternLM-7B model requires at least 15.7G of GPU memory overhead on TurboMind. It is recommended to use NVIDIA cards such as 3090, V100, A100, etc.
+Disabling GPU ECC can free up 10% of GPU memory; try `sudo nvidia-smi --ecc-config=0` and then reboot the system.
 ```
 #### Serving

--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -87,7 +87,8 @@ docker run --gpus all --rm -v $(pwd)/workspace:/workspace -it openmmlab/lmdeploy
 ```
 ```{note}
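-When running FP16 inference on the InternLM-7B model, TurboMind needs at least 15.7G of GPU memory. NVIDIA cards such as the 3090, V100, and A100 are recommended
+When running FP16 inference on the InternLM-7B model, TurboMind needs at least 15.7G of GPU memory. NVIDIA cards such as the 3090, V100, and A100 are recommended.
+Disabling GPU ECC can free up 10% of GPU memory; run `sudo nvidia-smi --ecc-config=0` and reboot the system for the change to take effect.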
 ```
 #### Deploying the inference service
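
The note added in this commit can be tried out roughly as follows. This is a minimal sketch, not part of the commit: `estimate_freed_mib` is a hypothetical helper introduced here for illustration, the 10% figure is simply the estimate quoted in the README, and 24576 MiB is an assumed card size. The `nvidia-smi` query only reports the current and pending ECC mode; it does not change anything.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): estimate how many MiB would be
# reclaimed, given total GPU memory in MiB and an assumed reclaimable percentage.
estimate_freed_mib() {
    # $1 = total memory in MiB, $2 = percent assumed reclaimable
    echo $(( $1 * $2 / 100 ))
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Report the current and pending ECC mode and total memory for each GPU.
    nvidia-smi --query-gpu=name,ecc.mode.current,ecc.mode.pending,memory.total \
        --format=csv || echo "nvidia-smi query failed (no supported GPU?)"
    # To actually disable ECC, run the command from the README as root,
    # then reboot so the pending mode becomes the current mode:
    #   sudo nvidia-smi --ecc-config=0
else
    echo "nvidia-smi not found; skipping GPU query"
fi

# Integer estimate for an assumed 24576 MiB card, using the README's 10% figure.
estimate_freed_mib 24576 10
```

Checking `ecc.mode.current` first is worthwhile because the setting only matters on GPUs that report ECC as enabled; cards without ECC support will show it as not available, and the README's memory saving would not apply to them.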