Commit 67b31a9b authored by xuxzh1

update README

parent c32de00b
@@ -49,7 +49,7 @@ go build .
## Run
```bash
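-export HSA_OVERRIDE_GFX_VERSION=device model (e.g. gfx906 maps to 9.0.6; k100ai gfx928 maps to 9.2.8)
+export HSA_OVERRIDE_GFX_VERSION=device model (e.g. Z100L gfx906 maps to 9.0.6; K100 gfx926 maps to 9.2.6; K100AI gfx928 maps to 9.2.8)
export ROCR_VISIBLE_DEVICES=all device IDs (0,1,2,3,4,5,6,...) / selected device IDs
./ollama serve (choose from the available devices; they are listed in the output of the previous command)
# Added flash attention (FA) and KV cache quantization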
@@ -60,7 +60,7 @@ OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q4_0 ./ollama serve
## deepseek-r1 model inference
```
-export HSA_OVERRIDE_GFX_VERSION=device model (e.g. gfx906 maps to 9.0.6; k100ai gfx928 maps to 9.2.8)
+export HSA_OVERRIDE_GFX_VERSION=device model (e.g. Z100L gfx906 maps to 9.0.6; K100 gfx926 maps to 9.2.6; K100AI gfx928 maps to 9.2.8)
./ollama serve
./ollama run deepseek-r1:671b
```
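
The README's three examples (gfx906 to 9.0.6, gfx926 to 9.2.6, gfx928 to 9.2.8) all follow the same pattern: drop the `gfx` prefix and put a dot between the three digits. A minimal sketch of that conversion is shown below. The helper name `gfx_to_version` is hypothetical and not part of ollama or this repository, and the function deliberately rejects anything other than three decimal digits, since the README gives no examples for other target names.

```shell
#!/bin/sh
# Hypothetical helper (not part of ollama or this README): turn a three-digit
# gfx target such as "gfx906" into the dotted form "9.0.6".
gfx_to_version() {
    digits="${1#gfx}"                      # strip the "gfx" prefix if present
    case "$digits" in
        [0-9][0-9][0-9]) ;;                # accept exactly three decimal digits
        *) echo "unsupported target: $1" >&2; return 1 ;;
    esac
    # Put a dot between the digits: 906 -> 9.0.6
    echo "$digits" | sed 's/\(.\)\(.\)\(.\)/\1.\2.\3/'
}

# Example: K100AI (gfx928), as listed in the README.
HSA_OVERRIDE_GFX_VERSION="$(gfx_to_version gfx928)"
export HSA_OVERRIDE_GFX_VERSION
echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
```

After sourcing this snippet, `./ollama serve` can be started in the same shell so that it inherits the exported variable.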