Commits · 8d17774f924da6a3b730289f351205f3b17095c6 · xdb4_94051 / vllm

19 Nov, 2023 1 commit
- Add AWQ support for all models (#1714) · 8d17774f
  Woosuk Kwon authored Nov 18, 2023
  
16 Nov, 2023 2 commits

Revert `MptConfig` to `MPTConfig` (#1668) · b514d3c4
Megha Agarwal authored Nov 16, 2023


TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622) · 7076fa1c

Zhuohan Li authored Nov 15, 2023

Refactor the tensor parallelism, quantization, and weight-loading code.

Summary of the new features enabled by this PR:
- **All models** can now be quantized with AWQ and SqueezeLLM, with [GPTQ support coming soon](https://github.com/vllm-project/vllm/pull/1580).
- The model loading code is now much simpler.
- Model parallelism is supported for all MQA/GQA models even when the number of key/value heads is smaller than the tensor parallel size.
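
The last point can be illustrated with a minimal sketch. It assumes that when there are fewer key/value heads than tensor-parallel ranks, each KV head is replicated across several ranks. The commit message does not spell out the mechanism, so this is only one plausible scheme, and the function names `kv_heads_per_rank` and `kv_heads_for_rank` are hypothetical, not vLLM's API.

```python
from typing import List, Tuple


def kv_heads_per_rank(total_kv_heads: int, tp_size: int) -> Tuple[int, int]:
    """Return (kv_heads_on_each_rank, replicas_of_each_kv_head).

    Assumption: head count and tensor parallel size divide evenly one way
    or the other; anything else is rejected in this sketch.
    """
    if total_kv_heads >= tp_size:
        if total_kv_heads % tp_size != 0:
            raise ValueError("total_kv_heads must be divisible by tp_size")
        # Enough heads to go around: split them evenly, no replication.
        return total_kv_heads // tp_size, 1
    if tp_size % total_kv_heads != 0:
        raise ValueError("tp_size must be divisible by total_kv_heads")
    # Fewer heads than ranks: every rank holds one head, and each head
    # is copied onto tp_size // total_kv_heads ranks.
    return 1, tp_size // total_kv_heads


def kv_heads_for_rank(rank: int, total_kv_heads: int, tp_size: int) -> List[int]:
    """List the global KV head indices that a given rank would hold."""
    per_rank, replicas = kv_heads_per_rank(total_kv_heads, tp_size)
    start = (rank // replicas) * per_rank
    return list(range(start, start + per_rank))


if __name__ == "__main__":
    # GQA model with 2 KV heads on 4 ranks: each head lives on two ranks.
    for rank in range(4):
        print(rank, kv_heads_for_rank(rank, total_kv_heads=2, tp_size=4))
```

Under this scheme a multi-query model (one KV head) places the same head on every rank, while a model with at least as many KV heads as ranks falls back to an ordinary even split.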


01 Nov, 2023 1 commit
- Remove `MPTConfig` (#1529) · 1fe09900
  Woosuk Kwon authored Nov 01, 2023
  
02 Oct, 2023 1 commit
- TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) · ba0bfd40
  Zhuohan Li authored Oct 02, 2023
  
13 Sep, 2023 1 commit

Add Model Revision Support (#1014) · ab019eea

Jasmond L authored Sep 14, 2023


Co-authored-by: Jasmond Loh <Jasmond.Loh@hotmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>


07 Sep, 2023 1 commit
- Enable safetensors loading for all models (#974) · c957c741
  Zhuohan Li authored Sep 07, 2023
  
05 Sep, 2023 1 commit
- Align vLLM's beam search implementation with HF generate (#857) · 002800f0
  Zhuohan Li authored Sep 04, 2023
  
09 Jul, 2023 1 commit
- [Model] Add support for GPT-J (#226) · c8948361
  Andre Slavescu authored Jul 08, 2023
  Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
03 Jul, 2023 1 commit
- [Model] Add support for MPT (#334) · 404422f4
  Woosuk Kwon authored Jul 03, 2023
  