Unverified Commit e641dd86 authored by pppppM, committed by GitHub

[Docs] Update Supported Matrix (#679)

* update supported matrix

* change the default shard size when saving quantized weights

* enable KV Cache INT8 for Baichuan2
parent ab1767cf
--- a/README.md
+++ b/README.md
@@ -66,10 +66,10 @@ LMDeploy is a toolkit for compressing, deploying, and serving LLM, developed by
 | SOLAR | Yes | Yes | Yes | Yes | No |
 | InternLM-7B | Yes | Yes | Yes | Yes | No |
 | InternLM-20B | Yes | Yes | Yes | Yes | No |
-| QWen-7B | Yes | Yes | Yes | No | No |
-| QWen-14B | Yes | Yes | Yes | No | No |
+| QWen-7B | Yes | Yes | Yes | Yes | No |
+| QWen-14B | Yes | Yes | Yes | Yes | No |
 | Baichuan-7B | Yes | Yes | Yes | Yes | No |
-| Baichuan2-7B | Yes | Yes | No | No | No |
+| Baichuan2-7B | Yes | Yes | Yes | Yes | No |
 | Code Llama | Yes | Yes | No | No | No |
 ### Pytorch
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -67,10 +67,10 @@ LMDeploy is developed by [MMDeploy](https://github.com/open-mmlab/mmdeploy) and [MMRazor](ht
 | SOLAR | Yes | Yes | Yes | Yes | No |
 | InternLM-7B | Yes | Yes | Yes | Yes | No |
 | InternLM-20B | Yes | Yes | Yes | Yes | No |
-| QWen-7B | Yes | Yes | Yes | No | No |
-| QWen-14B | Yes | Yes | Yes | No | No |
+| QWen-7B | Yes | Yes | Yes | Yes | No |
+| QWen-14B | Yes | Yes | Yes | Yes | No |
 | Baichuan-7B | Yes | Yes | Yes | Yes | No |
-| Baichuan2-7B | Yes | Yes | No | No | No |
+| Baichuan2-7B | Yes | Yes | Yes | Yes | No |
 | Code Llama | Yes | Yes | No | No | No |
 ### Pytorch
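The KV INT8 bullet in the commit message corresponds to the Baichuan2-7B row flipping to Yes above. For reference, a minimal sketch of LMDeploy's two-step KV Cache INT8 workflow from this period (collect activation statistics, then export quantization parameters); the module paths, function names, and default values below are assumptions drawn from the project's documentation of that era, not confirmed by this diff:

```python
# Hypothetical sketch of the KV Cache INT8 flow; paths and signatures
# are assumptions, not taken from this commit.
from lmdeploy.lite.apis.calibrate import calibrate
from lmdeploy.lite.apis.kv_qparams import main as kv_qparams

# Step 1: run calibration to record key/value activation ranges.
calibrate(
    model='baichuan-inc/Baichuan2-7B-Chat',  # example model id or local path
    calib_dataset='c4',     # calibration corpus
    calib_samples=128,      # number of calibration samples
    calib_seqlen=2048,      # sequence length per sample
    work_dir='./kv_stats',  # statistics are written here
)

# Step 2: turn the recorded statistics into per-layer INT8 scales and
# zero-points, placed next to the TurboMind weights.
kv_qparams(
    work_dir='./kv_stats',
    turbomind_dir='./workspace/triton_models/weights',
    kv_bits=8,      # the "KV INT8" in the commit message
    kv_sym=False,   # asymmetric quantization
    num_tp=1,       # tensor-parallel degree
)
```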
--- a/lmdeploy/lite/apis/auto_awq.py
+++ b/lmdeploy/lite/apis/auto_awq.py
@@ -83,7 +83,7 @@ def auto_awq(model: str,
     smooth_layers(layers, fc2fcs, norm2fcs, act_scales, w_group_size, device)
     quant_weights(model, fcs, w_bits, w_sym, w_group_size, device)
-    model.save_pretrained(work_dir)
+    model.save_pretrained(work_dir, max_shard_size='2GB')
     tokenizer.save_pretrained(work_dir)
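`max_shard_size` is a standard argument of Hugging Face Transformers' `save_pretrained`: a checkpoint larger than the limit is split into numbered shard files plus an index JSON, so the quantized model no longer serializes as one monolithic file. A minimal usage sketch of the `auto_awq` entry point after this change; only the parameter names are visible in the hunk, so the module path and the keyword defaults shown here are assumptions:

```python
# Hypothetical usage sketch; argument names come from the hunk above,
# default values and module path are assumptions.
from lmdeploy.lite.apis.auto_awq import auto_awq

auto_awq(
    model='baichuan-inc/Baichuan2-7B-Chat',  # example HF model id or local path
    work_dir='./baichuan2-7b-4bit',          # quantized weights land here
    w_bits=4,          # weight bit-width passed to quant_weights(...)
    w_sym=False,       # symmetric vs. asymmetric weight quantization
    w_group_size=128,  # per-group quantization granularity
)

# With max_shard_size='2GB', save_pretrained() writes the state dict as
# pytorch_model-00001-of-0000N.bin shards plus pytorch_model.bin.index.json,
# keeping every file under the 2 GB limit.
```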