Commits · 34cda778a091d4e1fd204cfde4a0f5e2b5616ac2 · OpenDAS / vllm_cscc

16 Jul, 2025 1 commit
- [Frontend] OpenAI Responses API supports input image (#20975) · 34cda778
  Chauncey authored Jul 16, 2025
```
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
```
  34cda778
10 Jul, 2025 1 commit
- [V0][V1][Core] Add outlines integration for V1, and update V0 integration. (#15975) · d6902ce7
  Nathan Hoos authored Jul 10, 2025
```
Signed-off-by: Nathan Hoos <thwackyy.y@gmail.com>
```
  d6902ce7
07 Jul, 2025 1 commit
- Implement OpenAI Responses API [1/N] (#20504) · 462b2692
  Woosuk Kwon authored Jul 06, 2025
```
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
```
  462b2692
03 Jul, 2025 1 commit
- [Tests] Update online DP tests to verify that requests are balanced (#20157) · 67d25eca
  Nick Hill authored Jul 03, 2025
```
Signed-off-by: Nick Hill <nhill@redhat.com>
```
  67d25eca
03 Jun, 2025 1 commit
- [Misc] Add SPDX-FileCopyrightText (#19100) · 02f0c7b2
  Simon Mo authored Jun 03, 2025
```
Signed-off-by: simon-mo <simon.mo@hey.com>
```
  02f0c7b2
30 May, 2025 1 commit
- [Perf] API-server scaleout with many-to-many server-engine comms (#17546) · 2dbe8c07
  Nick Hill authored May 30, 2025
  
  2dbe8c07
23 May, 2025 1 commit
- [Bugfix] Migrate to REGEX Library to prevent catastrophic backtracking (#18454) · 4fc1bf81
  Feng XiaoLong authored May 24, 2025
```
Signed-off-by: Crucifixion-Fxl <xmufxl@gmail.com>
Co-authored-by: Crucifixion-Fxl <xmufxl@gmail.com>
```
  4fc1bf81
14 May, 2025 1 commit
- [V1] Structured Outputs + Thinking compatibility (#16577) · 2fc9075b
  Aaron Pham authored May 14, 2025
  Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
  Co-authored-by: Russell Bryant <rbryant@redhat.com>
  2fc9075b
13 May, 2025 1 commit
- [Feature][V1] Support `tool_choice: required` when using Xgrammar as the... · dc1a8217
  Chauncey authored May 13, 2025
  [Feature][V1] Support `tool_choice: required` when using Xgrammar as the `StructuredOutputBackend`. (#17845)
  Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
  dc1a8217
12 May, 2025 2 commits
- [CI] Make JSON output tests less likely to fail (#17859) · ebab1ac3
  Russell Bryant authored May 12, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  ebab1ac3
- [Bugfix] validate grammar and throw 400 error instead of crashing the engine... · 08bf7840
  Cheng Kuan Yong Jason authored May 12, 2025
  [Bugfix] validate grammar and throw 400 error instead of crashing the engine when xgrammar validation fails (#17623)
  Signed-off-by: Jason Cheng <jasoncky96@gmail.com>
  Co-authored-by: Russell Bryant <rbryant@redhat.com>
  08bf7840
08 May, 2025 1 commit
- [V1] Add VLLM_ALLOW_INSECURE_SERIALIZATION env var (#17490) · 6930a411
  Russell Bryant authored May 08, 2025
  Signed-off-by: Russell Bryant <rbryant@redhat.com>
  Signed-off-by: Nick Hill <nhill@redhat.com>
  Co-authored-by: Nick Hill <nhill@redhat.com>
  6930a411
01 May, 2025 1 commit
- [CI][TPU] Skip structured outputs+spec decode tests on TPU (#17510) · 17b4d85f
  Michael Goin authored Apr 30, 2025
```
Signed-off-by: mgoin <mgoin64@gmail.com>
```
  17b4d85f
30 Apr, 2025 1 commit
- [V1][Feature] Enable Speculative Decoding with Structured Outputs (#14702) · 34120f5a
  Benjamin Chislett authored Apr 29, 2025
```
Signed-off-by: Benjamin Chislett <benjamin.chislett@centml.ai>
Signed-off-by: Benjamin Chislett <chislett.ben@gmail.com>
```
  34120f5a
29 Apr, 2025 2 commits
- Simplify (and fix) passing of guided decoding backend options (#17008) · a6977dbd
  Harry Mellor authored Apr 29, 2025
```
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
```
  a6977dbd
- implement Structural Tag with Guidance backend (#17333) · 86d9fc29
  Michał Moskal authored Apr 28, 2025
```
Signed-off-by: Michal Moskal <michal@moskal.me>
```
  86d9fc29
26 Apr, 2025 1 commit
- [V1] Add `structural_tag` support using xgrammar (#17085) · f8acd01f
  Russell Bryant authored Apr 26, 2025
  
  f8acd01f
25 Apr, 2025 1 commit
- [Bugfix] remove fallback in guided_json (int range, patterns) (#16725) · 6aae216b
  Sangyeon Cho authored Apr 25, 2025
  Signed-off-by: csy1204 <josang1204@gmail.com>
  Co-authored-by: 조상연[플레이스 AI] <sang-yeon.cho@navercorp.com>
  6aae216b
24 Apr, 2025 1 commit
- Disable enforce_eager for V1 TPU sampler and structured output tests (#17016) · 14288d13
  Michael Goin authored Apr 24, 2025
```
Signed-off-by: mgoin <mgoin64@gmail.com>
```
  14288d13
23 Apr, 2025 1 commit
- [Frontend] Support guidance:no-additional-properties for compatibility with xgrammar (#15949) · 3cde34a4
  Travis Johnson authored Apr 23, 2025
```
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
```
  3cde34a4
22 Apr, 2025 1 commit
- [Bugfix] Fix the issue where llm.generate cannot be called repeatedly after... · acba33a0
  Chauncey authored Apr 22, 2025
  [Bugfix] Fix the issue where llm.generate cannot be called repeatedly after setting GuidedDecodingParams (#16767)
  Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
  Signed-off-by: Russell Bryant <rbryant@redhat.com>
  Co-authored-by: Russell Bryant <rbryant@redhat.com>
  acba33a0
12 Apr, 2025 1 commit
- [Feature][V1] Add xgrammar to support minLength, maxLength with test (#16516) · e92d7085
  leon-seidel authored Apr 12, 2025
```
Signed-off-by: Leon Seidel <leon.seidel@fau.de>
```
  e92d7085
02 Apr, 2025 1 commit
- [V1] Fix json_object support with xgrammar (#15488) · 14e53ed1
  Russell Bryant authored Apr 02, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  14e53ed1
01 Apr, 2025 1 commit
- [CI] Disable flaky structure decoding test temporarily. (#15892) · 7e3f7a4e
  Roger Wang authored Apr 01, 2025
```
Signed-off-by: Roger Wang <ywang@roblox.com>
```
  7e3f7a4e
30 Mar, 2025 1 commit
- [Bugfix] Fix Mistral guided generation using xgrammar (#15704) · 6909a762
  Julien Denize authored Mar 30, 2025
```
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
```
  6909a762
29 Mar, 2025 1 commit
- [CI] Speed up V1 structured output tests (#15718) · 7a799208
  Russell Bryant authored Mar 29, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  7a799208
28 Mar, 2025 2 commits
- [V1] Support disable_any_whitespace for guidance backend (#15584) · 7329ff54
  Russell Bryant authored Mar 28, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  7329ff54
- [Bugfix][v1] xgrammar structured output supports Enum. (#15594) · 3b00ff91
  Chauncey authored Mar 28, 2025
```
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
```
  3b00ff91
25 Mar, 2025 1 commit
- [V1] guidance backend for structured output + `auto` fallback mode (#14779) · a09ad90a
  Russell Bryant authored Mar 25, 2025
  Signed-off-by: Russell Bryant <rbryant@redhat.com>
  Co-authored-by: Loc Huynh <jc1da.3011@gmail.com>
  Co-authored-by: Michal Moskal <michal@moskal.me>
  a09ad90a
22 Mar, 2025 1 commit
- [V1] Add `disable-any-whitespace` option support for xgrammar (#15316) · eb63ea1e
  Russell Bryant authored Mar 22, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  eb63ea1e
17 Mar, 2025 2 commits
- [Fix][Structured Output] using vocab_size to construct matcher (#14868) · c0efdd65
  Aaron Pham authored Mar 17, 2025
  Signed-off-by: Russell Bryant <rbryant@redhat.com>
  Signed-off-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
  Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
  Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
  Co-authored-by: Russell Bryant <rbryant@redhat.com>
  Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
  c0efdd65
- [V1] Enable Entrypoints Tests (#14903) · aecc780d
  Robert Shaw authored Mar 16, 2025
  
  aecc780d
14 Mar, 2025 1 commit
- [V1] Fix model parameterization for structured output tests (#14833) · 46f98893
  Russell Bryant authored Mar 14, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  46f98893
12 Mar, 2025 2 commits
- [BugFix][V1] Fix parallel sampling finishing/aborts (#14512) · f5d3acd4
  Nick Hill authored Mar 12, 2025
```
Signed-off-by: Nick Hill <nhill@redhat.com>
```
  f5d3acd4
- [V1][Core] Support MistralTokenizer for Structured Output (#14625) · 77a318bd
  Aaron Pham authored Mar 11, 2025
```
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
```
  77a318bd
11 Mar, 2025 1 commit
- [V1] Add regex structured output support with xgrammar (#14590) · 4bf82d4b
  Russell Bryant authored Mar 11, 2025
```
Signed-off-by: Russell Bryant <rbryant@redhat.com>
```
  4bf82d4b
07 Mar, 2025 1 commit
- [V1][Core] Support for Structured Outputs (#12388) · 80e9afb5
  Aaron Pham authored Mar 07, 2025
  Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
  Signed-off-by: Russell Bryant <rbryant@redhat.com>
  Co-authored-by: Russell Bryant <rbryant@redhat.com>
  Co-authored-by: Michael Goin <mgoin64@gmail.com>
  Co-authored-by: Nick Hill <nhill@redhat.com>
  80e9afb5
03 Mar, 2025 1 commit
- Update deprecated Python 3.8 typing (#13971) · cf069aa8
  Harry Mellor authored Mar 03, 2025
  
  cf069aa8
24 Feb, 2025 1 commit
- [V1] V1 engine implements parallel sampling (AsyncLLM and LLMEngine) (#10980) · befc402d
  afeldman-nm authored Feb 24, 2025
```
Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
```
  befc402d
07 Feb, 2025 1 commit
- [V1] Logprobs and prompt logprobs support (#9880) · 0630d453
  afeldman-nm authored Feb 07, 2025

  This PR adds support for sample logprobs and prompt logprobs to vLLM v1.

  New behavior:

  - During model execution, the model runner computes sample logprobs (if the user-provided logprobs setting is not None) and prompt logprobs (if the user-provided prompt_logprobs setting is not None). For both sample and prompt logprobs, the engine core returns three vectors: token ids, token logprob values, and token ranks. A token's rank is its 1-indexed position in the vocabulary after sorting the vocabulary by log probability in descending order.
  - In scheduler.update_from_output(), sample and prompt logprobs are incorporated into the EngineCoreOutput data structure, which is transferred to the engine client. If multiprocessing is enabled, sample and prompt logprobs are (de)serialized along with the EngineCoreOutput data structure.
  - During output processing, the LogprobsProcessor transforms the triplet of token ids, token logprob values, and token ranks into the OpenAI-compatible List[Dict[token id, Logprob]] format (for sample and prompt logprobs, respectively).
  - Each Logprob instance (whether for a sample or a prompt token) consists of a token's log-probability, rank, and detokenized string representation. Note that logprob detokenization is handled by the LogprobsProcessor, not the detokenizer.

  Signed-off-by: Andrew Feldman <afeldman@neuralmagic.com>
  Signed-off-by: Nick Hill <nhill@redhat.com>
  Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
  Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
  Co-authored-by: Nick Hill <nhill@redhat.com>
  0630d453
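
The rank definition and the triplet-to-dictionary step described in the logprobs commit (#9880) can be illustrated with a small, self-contained sketch. This is not vLLM's code: the `Logprob` dataclass is a local stand-in with the three fields the commit message lists, and `token_rank`, `top_k_triplet`, and `to_logprob_dict` are hypothetical helper names. Placing the sampled token ahead of the top-k entries is also an assumption made for this example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Logprob:
    # Local stand-in for illustration only; not imported from vLLM.
    logprob: float
    rank: Optional[int] = None
    decoded_token: Optional[str] = None


def log_softmax(logits):
    """Numerically stable log-softmax over a plain list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_rank(logprobs, token_id):
    """1-indexed position of token_id when the vocabulary is sorted by
    log probability in descending order."""
    target = logprobs[token_id]
    return 1 + sum(1 for lp in logprobs if lp > target)


def top_k_triplet(logprobs, sampled_id, k):
    """Build the three parallel vectors (token ids, logprob values, ranks)
    for the sampled token followed by the k most likely tokens."""
    order = sorted(range(len(logprobs)), key=lambda i: -logprobs[i])
    ids = [sampled_id] + order[:k]
    vals = [logprobs[i] for i in ids]
    ranks = [token_rank(logprobs, i) for i in ids]
    return ids, vals, ranks


def to_logprob_dict(ids, vals, ranks, vocab):
    """Collapse one position's triplet into a Dict[token id, Logprob].
    A sampled token that is also in the top-k appears only once."""
    return {i: Logprob(v, r, vocab[i]) for i, v, r in zip(ids, vals, ranks)}


# Toy vocabulary and logits; a real tokenizer would supply the strings.
vocab = ["a", "b", "c", "d"]
logits = [1.0, 3.0, 2.0, 0.5]
lps = log_softmax(logits)
ids, vals, ranks = top_k_triplet(lps, sampled_id=2, k=2)
position = to_logprob_dict(ids, vals, ranks, vocab)
```

Repeating `to_logprob_dict` for each generated position would give a list of dictionaries in the shape the commit message calls `List[Dict[token id, Logprob]]`.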