Unverified commit d374a2dd authored by Casper, committed by GitHub

Commit message: Latest news

Parent: bcf78947
# AutoAWQ
<p align="center">
| <a href="https://github.com/casper-hansen/AutoAWQ/issues/32"><b>Roadmap</b></a> | <a href="https://github.com/casper-hansen/AutoAWQ/tree/main/examples"><b>Examples</b></a> | <a href="https://github.com/casper-hansen/AutoAWQ/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22"><b>Issues: Help Wanted</b></a> |
</p>

AutoAWQ is an easy-to-use package for 4-bit quantized models. AutoAWQ speeds up models by 2x while reducing memory requirements by 3x compared to FP16. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs. AutoAWQ was created and improved upon from the [original work](https://github.com/mit-han-lab/llm-awq) from MIT.

*Latest News* 🔥

- [2023/09] Multi-GPU support, bug fixes, and better benchmark scripts available
- [2023/08] PyPi package released and AutoModel class available
## Install

...

The detailed support list:
| Falcon | 7B/40B |
| OPT | 125m/1.3B/2.7B/6.7B/13B/30B |
| Bloom | 560m/3B/7B |
| LLaVA-v0 | 13B |
| GPTJ | 6.7B |
## Usage

Below, you will find examples of how to easily quantize a model and run inference.

### Quantization
...