Bump to v0.1.6 (#139)

Commit abf44cce (unverified), authored Nov 02, 2023 by Casper, committed by GitHub Nov 02, 2023

Showing 3 changed files with 3 additions and 3 deletions

README.md +1 -1
awq/__init__.py +1 -1
setup.py +1 -1
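
Because this bump edits the version string by hand in two places (`awq/__init__.py` and `setup.py`), a small consistency check can catch a missed file. The following is a minimal sketch, not part of AutoAWQ: the helper names `base_version` and `versions_agree` are hypothetical, and the input lines are copied from the diff below rather than read from disk.

```python
import re


def base_version(text: str) -> str:
    """Return the first X.Y.Z version that opens a quoted string in `text`.

    Hypothetical helper for illustration; it ignores any local suffix
    such as "+cu{CUDA_VERSION}".
    """
    match = re.search(r'["\'](\d+\.\d+\.\d+)', text)
    if match is None:
        raise ValueError("no version string found")
    return match.group(1)


def versions_agree(lines) -> bool:
    """True when every line carries the same base version."""
    return len({base_version(line) for line in lines}) == 1


# Lines taken from the "+" side of this commit's diff.
init_line = '__version__ = "0.1.6"'
setup_line = '"version": f"0.1.6+cu{CUDA_VERSION}",'

if __name__ == "__main__":
    print(base_version(init_line), base_version(setup_line))
    print(versions_agree([init_line, setup_line]))
```

A check like this could run in CI before tagging a release. Reading the two files from the repository instead of using hard-coded strings is left out here to keep the sketch self-contained.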

--- a/README.md
+++ b/README.md
@@ -19,7 +19,7 @@
 AutoAWQ is an easy-to-use package for 4-bit quantized models. AutoAWQ speeds up models by 2x while reducing memory requirements by 3x compared to FP16. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs.  AutoAWQ was created and improved upon from the [original work](https://github.com/mit-han-lab/llm-awq) from MIT.
 *Latest News* 🔥
- [2023/11] AutoAWQ has been merged into 🤗 transformers. Example found in: [examples/basic_transformers](examples/basic_transformers.py).
+- [2023/11] AutoAWQ has been merged into 🤗 transformers. Now includes CUDA 12.1 wheels.
 - [2023/10] Mistral (Fused Modules), Bigcode, Turing support, Memory Bug Fix (Saves 2GB VRAM)
 - [2023/09] 1.6x-2.5x speed boost on fused models (now including MPT and Falcon).
 - [2023/09] Multi-GPU support, bug fixes, and better benchmark scripts available

--- a/awq/__init__.py
+++ b/awq/__init__.py
-__version__ = "0.1.5"
+__version__ = "0.1.6"
 from awq.models.auto import AutoAWQForCausalLM
\ No newline at end of file
--- a/setup.py
+++ b/setup.py
@@ -14,7 +14,7 @@ except Exception as ex:
    raise RuntimeError("Your system must have an Nvidia GPU for installing AutoAWQ")
 common_setup_kwargs = {
-    "version": f"0.1.5+cu{CUDA_VERSION}",
+    "version": f"0.1.6+cu{CUDA_VERSION}",
    "name": "autoawq",
    "author": "Casper Hansen",
    "license": "MIT",