Commit e20233d3, authored by Woosuk Kwon, committed by GitHub

Revert "[Doc] Update supported_hardware.rst (#7276)" (#7467)

parent d6e634f3
@@ -5,20 +5,18 @@ Supported Hardware for Quantization Kernels
 The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:
-===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
-Implementation        Volta  Turing  Ampere  Ada   Hopper AMD GPU Intel GPU x86 CPU AWS Inferentia Google TPU
-===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
-AWQ                   ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-GPTQ                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-Marlin (GPTQ/AWQ/FP8) ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-INT8 (W8A8)           ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-FP8 (W8A8)            ❌      ❌       ❌       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-AQLM                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-bitsandbytes          ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-DeepSpeedFP           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-GGUF                  ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-SqueezeLLM            ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
-===================== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+Implementation Volta  Turing  Ampere  Ada   Hopper AMD GPU Intel GPU x86 CPU AWS Inferentia Google TPU
+============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
+AQLM           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+AWQ            ❌      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+DeepSpeedFP    ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+FP8            ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+Marlin         ❌      ❌       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+GPTQ           ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+SqueezeLLM     ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+bitsandbytes   ✅      ✅       ✅       ✅     ✅      ❌       ❌         ❌       ❌              ❌
+============== ====== ======= ======= ===== ====== ======= ========= ======= ============== ==========
 Notes:
 ^^^^^^
@@ -29,4 +27,4 @@ Notes:
 Please note that this compatibility chart may be subject to change as vLLM continues to evolve and expand its support for different hardware platforms and quantization methods.
 For the most up-to-date information on hardware support and quantization methods, please check the `quantization directory <https://github.com/vllm-project/vllm/tree/main/vllm/model_executor/layers/quantization>`_ or consult with the vLLM development team.
\ No newline at end of file
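
As a reading aid, the table restored by this revert can be expressed as a small lookup. This is only a sketch: the names `NVIDIA_ARCHS`, `OTHER_PLATFORMS`, `FIRST_SUPPORTED`, and `is_supported` are hypothetical and are not part of vLLM. It assumes what the restored table shows, namely that every implementation is supported on a contiguous run of NVIDIA architectures ending at Hopper and on none of the other listed platforms.

```python
# Hypothetical helper that mirrors the restored table; not a vLLM API.

# NVIDIA architectures in the same left-to-right order as the table columns.
NVIDIA_ARCHS = ("Volta", "Turing", "Ampere", "Ada", "Hopper")

# Remaining table columns; every row in the restored table marks these as unsupported.
OTHER_PLATFORMS = ("AMD GPU", "Intel GPU", "x86 CPU", "AWS Inferentia", "Google TPU")

# Oldest NVIDIA architecture marked as supported for each implementation.
FIRST_SUPPORTED = {
    "AQLM": "Volta",
    "AWQ": "Turing",
    "DeepSpeedFP": "Volta",
    "FP8": "Ampere",
    "Marlin": "Ampere",
    "GPTQ": "Volta",
    "SqueezeLLM": "Volta",
    "bitsandbytes": "Volta",
}


def is_supported(implementation: str, platform: str) -> bool:
    """Return True if the restored table marks the combination as supported."""
    if implementation not in FIRST_SUPPORTED:
        raise KeyError(f"unknown implementation: {implementation}")
    if platform in OTHER_PLATFORMS:
        return False
    if platform not in NVIDIA_ARCHS:
        raise ValueError(f"unknown platform: {platform}")
    first = NVIDIA_ARCHS.index(FIRST_SUPPORTED[implementation])
    return NVIDIA_ARCHS.index(platform) >= first


if __name__ == "__main__":
    for impl in FIRST_SUPPORTED:
        supported = [arch for arch in NVIDIA_ARCHS if is_supported(impl, arch)]
        print(f"{impl}: {', '.join(supported)}")
```

The lookup reflects only the snapshot in this commit; as the notes in the diff say, the chart may change, so the quantization directory remains the authoritative source.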