Commits · 2358ca527bd0d2f8a99ced33586cc0d72ed51092 · OpenDAS / vllm_cscc

18 Feb, 2025 1 commit
- [Doc]: Improve feature tables (#13224) · 2358ca52
  Harry Mellor authored Feb 18, 2025
```
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
```
05 Feb, 2025 1 commit
- [Doc] Remove performance warning for auto_awq.md (#12743) · c53dc466
  Michael Goin authored Feb 05, 2025
  
31 Jan, 2025 1 commit

[Doc] int4 w4a16 example (#12585) · 44bbca78

Brian Dellabetta authored Jan 31, 2025

Based on a request by @mgoin, @kylesayrs and I have added an example
doc for int4 w4a16 quantization. It follows the pre-existing int8 w8a8
quantization example and the example available in
[`llm-compressor`](https://github.com/vllm-project/llm-compressor/blob/main/examples/quantization_w4a16/llama3_example.py).

FIX #n/a (no issue created)

@kylesayrs and I have discussed a couple of additional improvements for the
quantization docs. We will revisit them at a later date, possibly including:
- A section for "choosing the correct quantization scheme/compression technique"
- Additional vision or audio calibration datasets

---------
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>

29 Jan, 2025 1 commit
- [Doc] Convert docs to use colon fences (#12471) · dd6a3a02
  Harry Mellor authored Jan 29, 2025
```
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
```
23 Jan, 2025 2 commits

[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906) · e97f802b

Gregory Shtrasberg authored Jan 23, 2025


Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>

[Docs] Update FP8 KV Cache documentation (#12238) · 01a55941

Michael Goin authored Jan 22, 2025


Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

15 Jan, 2025 1 commit
- [Doc] Update examples to remove SparseAutoModelForCausalLM (#12062) · 3f9b7ab9
  Kyle Sayers authored Jan 15, 2025
```
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
```
14 Jan, 2025 1 commit

[Doc] Update Quantization Hardware Support Documentation (#12025) · 8a1f938e

TJian authored Jan 14, 2025


Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>

12 Jan, 2025 1 commit
- [CI/Build] Add markdown linter (#11857) · 43f3d9e6
  Rafael Vasquez authored Jan 12, 2025
```
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
```
08 Jan, 2025 1 commit
- [Doc] Move examples into categories (#11840) · aba8d6ee
  Harry Mellor authored Jan 08, 2025
```
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
```
06 Jan, 2025 1 commit
- [Doc][2/N] Reorganize Models and Usage sections (#11755) · ee77fdb5
  Cyrus Leung authored Jan 06, 2025
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```