Blame · docs/source/features/quantization/index.md · 19d98e0c7db96713f0e2201649159431177a56e2 · OpenDAS / vllm_cscc · GitLab

index.md 290 Bytes

Blame by line range:

- Lines 1-6: [Doc][2/N] Reorganize Models and Usage sections (#11755), Cyrus Leung, committed Jan 06, 2025
- Line 7: [Doc] Convert docs to use colon fences (#12471), Harry Mellor, committed Jan 29, 2025
- Lines 8-14: [Doc][2/N] Reorganize Models and Usage sections (#11755), Cyrus Leung, committed Jan 06, 2025
- Line 15: [Doc] int4 w4a16 example (#12585), Brian Dellabetta, committed Jan 31, 2025
- Lines 16-17: [Doc][2/N] Reorganize Models and Usage sections (#11755), Cyrus Leung, committed Jan 06, 2025
- Line 18: [Docs] Update FP8 KV Cache documentation (#12238), Michael Goin, committed Jan 23, 2025
- Line 19: [Doc] Convert docs to use colon fences (#12471), Harry Mellor, committed Jan 29, 2025

(quantization-index)=

# Quantization

Quantization trades off model precision for smaller memory footprint, allowing large models to be run on a wider range of devices.

:::{toctree}
:caption: Contents
:maxdepth: 1

supported_hardware
auto_awq
bnb
gguf
int4
int8
fp8
quantized_kvcache
:::
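
To make the precision-for-memory trade-off concrete, here is a minimal sketch of symmetric per-tensor int8 quantization using only the Python standard library. It is a generic illustration, not vLLM's implementation and not any of the methods listed in the contents above. The helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization (illustrative only).

    Maps floats to signed 8-bit integers with a single shared scale.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs else 1.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


# Hypothetical weights, stored as 32-bit floats.
weights = array("f", [0.12, -0.5, 0.33, 0.9, -0.07, 0.0, -0.88, 0.41])
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

fp32_bytes = weights.itemsize * len(weights)  # 4 bytes per value
int8_bytes = q.itemsize * len(q)              # 1 byte per value
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"fp32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"largest rounding error: {max_error:.6f} (scale = {scale:.6f})")
```

In this sketch the int8 copy takes a quarter of the storage, and each restored value differs from the original by at most half the scale. Real schemes typically refine this idea, for example with per-channel or per-group scales, to keep the error smaller.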