Commit 28ae7e24 authored by liam's avatar liam
Browse files

: finish patch

parents c2b4dc80 a6a1cc05
...@@ -194,17 +194,20 @@ It features the following arguments: ...@@ -194,17 +194,20 @@ It features the following arguments:
- `--cpu_infer`: Int (default=10). The number of CPUs used for inference. Should ideally be set to the (total number of cores - 2). - `--cpu_infer`: Int (default=10). The number of CPUs used for inference. Should ideally be set to the (total number of cores - 2).
<h3 id="supported-model"> Suggested Model</h3> <h3 id="suggested-model"> Suggested Model</h3>
| Model Name | Model Size | VRAM | Minimum DRAM | Recommended DRAM | | Model Name | Model Size | VRAM | Minimum DRAM | Recommended DRAM |
| ------------------------------ | ---------- | ----- | --------------- | ----------------- | | ------------------------------ | ---------- | ----- | --------------- | ----------------- |
| DeepSeek-V2-q4_k_m | 133G | 24G | 136G | 192G | | DeepSeek-V2-q4_k_m | 133G | 11G | 136G | 192G |
| DeepSeek-V2.5-q4_k_m | 133G | 11G | 136G | 192G |
| DeepSeek-V2.5-IQ4_XS | 117G | 10G | 107G | 128G |
| Qwen2-57B-A14B-Instruct-q4_k_m | 33G | 8G | 34G | 64G | | Qwen2-57B-A14B-Instruct-q4_k_m | 33G | 8G | 34G | 64G |
| DeepSeek-V2-Lite-q4_k_m | 9.7G | 3G | 13G | 16G | | DeepSeek-V2-Lite-q4_k_m | 9.7G | 3G | 13G | 16G |
| Mixtral-8x7B-q4_k_m | 25G | 1.6G | 51G | 64G | | Mixtral-8x7B-q4_k_m | 25G | 1.6G | 51G | 64G |
| Mixtral-8x22B-q4_k_m | 80G | 4G | 86.1G | 96G | | Mixtral-8x22B-q4_k_m | 80G | 4G | 86.1G | 96G |
| InternLM2.5-7B-Chat-1M | 15.5G | 15.5G | 8G(32K context) | 150G (1M context) | | InternLM2.5-7B-Chat-1M | 15.5G | 15.5G | 8G(32K context) | 150G (1M context) |
More will come soon. Please let us know which models you are most interested in. More will come soon. Please let us know which models you are most interested in.
Be aware that you need to be subject to their corresponding model licenses when using [DeepSeek](https://huggingface.co/deepseek-ai/DeepSeek-V2/blob/main/LICENSE) and [QWen](https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE). Be aware that you need to be subject to their corresponding model licenses when using [DeepSeek](https://huggingface.co/deepseek-ai/DeepSeek-V2/blob/main/LICENSE) and [QWen](https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE).
...@@ -249,10 +252,6 @@ Be aware that you need to be subject to their corresponding model licenses when ...@@ -249,10 +252,6 @@ Be aware that you need to be subject to their corresponding model licenses when
# GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628 # GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/deepseek-ai/DeepSeek-V2-Chat-0628
# python -m ktransformers.local_chat --model_path ./DeepSeek-V2-Chat-0628 --gguf_path ./DeepSeek-V2-Chat-0628-GGUF # python -m ktransformers.local_chat --model_path ./DeepSeek-V2-Chat-0628 --gguf_path ./DeepSeek-V2-Chat-0628-GGUF
```
```
``` ```
| model name | weights download link | | model name | weights download link |
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment