Commits · 35fa7129db3e5c00cd8aad5000f30eb6841fca83 · OpenDAS / dynamo

24 Apr, 2026 1 commit

feat(v4): cherry-pick #8665 onto release/deepseekv4 (#8709) · 35fa7129

Keiven C authored Apr 24, 2026


Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>


11 Apr, 2026 1 commit
- fix(llm): support reading eos_token_ids from tokenizer_config.json for models... · b425b65c
  Ryan McCormick authored Apr 10, 2026
```
fix(llm): support reading eos_token_ids from tokenizer_config.json for models like Qwen3.5 with <|im_end|> token (#8091)
```
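
As a rough illustration of the idea in this commit title, the sketch below resolves the EOS token named in a `tokenizer_config.json` document to its numeric ID through the `added_tokens_decoder` map. It is not the code from the commit: the function name, the sample document, and the token IDs are all assumptions chosen for illustration.

```python
import json


def eos_token_ids_from_tokenizer_config(config_text: str) -> list[int]:
    """Return the IDs of the EOS token declared in a tokenizer_config.json document.

    Hypothetical helper for illustration only; not Dynamo's implementation.
    """
    cfg = json.loads(config_text)
    eos = cfg.get("eos_token")
    # "eos_token" may be a plain string or an object with a "content" field.
    if isinstance(eos, dict):
        eos = eos.get("content")
    if eos is None:
        return []
    decoder = cfg.get("added_tokens_decoder", {})
    # Keys of "added_tokens_decoder" are token IDs encoded as strings.
    return sorted(
        int(token_id)
        for token_id, token in decoder.items()
        if token.get("content") == eos
    )


# Illustrative document; the IDs are placeholders, not taken from a real model.
sample = json.dumps({
    "eos_token": "<|im_end|>",
    "added_tokens_decoder": {
        "151643": {"content": "<|endoftext|>", "special": True},
        "151645": {"content": "<|im_end|>", "special": True},
    },
})

print(eos_token_ids_from_tokenizer_config(sample))
```
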
02 Apr, 2026 1 commit
- feat: support chat_template.json as a prompt formatter artifact (#7785) · 06a24503
  Neal Vaidya authored Apr 02, 2026
```
Closes https://github.com/ai-dynamo/dynamo/issues/7737
```
31 Mar, 2026 1 commit
- fix: allow having no rust tokenizer when using dyn-chat-processor vllm (#7697) · 76c70f41
  Neal Vaidya authored Mar 31, 2026
  
15 Mar, 2026 1 commit
- feat: integrate fastokens BPE tokenizer backend (#7387) · da810a26
  Biswa Panda authored Mar 15, 2026
  
05 Mar, 2026 1 commit
- fix(llm): preserve reasoning content when tool-call starts mid-chunk (#6902) · 57e6a79f
  Graham King authored Mar 05, 2026
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
25 Feb, 2026 1 commit
- feat: Tiktoken support (#6460) · 21fce9ba
  Nikita authored Feb 25, 2026
```
Signed-off-by: Nikita Sukharev <kaonael@gmail.com>
```
02 Jan, 2026 1 commit
- chore: update all copyright headers in repo to 2026 (#5130) · cf433e68
  Tushar Sharma authored Jan 02, 2026
```
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
```
16 Dec, 2025 1 commit
- feat: video decoder in the frontend (#4719) · 74fcd4a9
  milesial authored Dec 16, 2025
```
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
```
12 Dec, 2025 1 commit

feat: DeepSeek V3.2 chat template support (#4797) · 1efc7d63

Vladislav Nosivskoy authored Dec 12, 2025


Signed-off-by: Vladislav Nosivskoy <vladnosiv@gmail.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>


18 Nov, 2025 1 commit

feat: Support toolcall parser for DeepSeek V3 and R1 (#4253) · 164b0c29

tangcy98 authored Nov 19, 2025


Signed-off-by: zhangzhang <tangchenyu@xiaohongshu.com>
Co-authored-by: zhangzhang <tangchenyu@xiaohongshu.com>
Co-authored-by: Ayush Agarwal <ayushag@nvidia.com>


08 Nov, 2025 1 commit
- fix: no more multiple finish reasons in stream (#4154) · 04f7579b
  Ayush Agarwal authored Nov 07, 2025
```
Signed-off-by: ayushag <ayushag@nvidia.com>
```
04 Nov, 2025 1 commit
- feat: Image decoder in the frontend (#3971) · ae4b08ac
  milesial authored Nov 04, 2025
```
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
```
21 Oct, 2025 1 commit
- chore: tool calling -- sglang e2e tests (#3650) · 43329cd6
  Elyas Mehtabuddin authored Oct 20, 2025
  
06 Oct, 2025 1 commit
- chore: added vllm tool + reasoning data (#3416) · d4b09631
  Ayush Agarwal authored Oct 06, 2025
```
Signed-off-by: ayushag <ayushag@nvidia.com>
```
15 Sep, 2025 1 commit
- fix: Handle invalid JSON in config.json (#3043) · b1186aee
  Graham King authored Sep 15, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
17 Jul, 2025 1 commit
- feat: record + analyze logprobs (#1957) · 49b7a0d9
  Ryan Olson authored Jul 17, 2025
  
22 May, 2025 1 commit

feat(dynamo-run): Allow setting KV cache block size (#1175) · 183f2b32

Graham King authored May 22, 2025

Example:
```
dynamo-run out=<engine> <model> --kv-cache-block-size 64
```

In a distributed system this flag is set on the worker node, and the value is propagated to ingress via the model deployment card.

The block size was previously hard-coded to 16, which is now the default.

- Load context_length from model. Closes #1172
- Store context length and KV cache block size in Model Deployment Card #1170
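
To make the relationship between the two stored values concrete, here is a minimal sketch of a card-like record that carries a context length and a KV cache block size (defaulting to 16, as the message states) and counts how many blocks a request would occupy. `DeploymentCardSketch` and `blocks_for` are hypothetical names, and the block-counting use is an assumption about why a front end might want the value; this is not Dynamo's Model Deployment Card type.

```python
import math
from dataclasses import dataclass

# Default taken from the commit message ("Previously hard coded to 16").
DEFAULT_KV_CACHE_BLOCK_SIZE = 16


@dataclass
class DeploymentCardSketch:
    """Hypothetical stand-in for a model deployment card (illustration only)."""

    context_length: int
    kv_cache_block_size: int = DEFAULT_KV_CACHE_BLOCK_SIZE

    def blocks_for(self, num_tokens: int) -> int:
        """Number of KV cache blocks needed to hold num_tokens tokens."""
        if num_tokens > self.context_length:
            raise ValueError("request exceeds the model's context length")
        # A partially filled block still occupies a whole block.
        return math.ceil(num_tokens / self.kv_cache_block_size)


default_card = DeploymentCardSketch(context_length=8192)
custom_card = DeploymentCardSketch(context_length=8192, kv_cache_block_size=64)

print(default_card.blocks_for(1000))  # 63 blocks of 16 tokens
print(custom_card.blocks_for(1000))   # 16 blocks of 64 tokens
```

In this sketch a larger block size means fewer blocks to track per request, at the cost of more unused slots in the last block.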


08 May, 2025 1 commit

feat: Qwen3, Gemma3 and Llama4 support (#1002) · ceaeba3e

Graham King authored May 08, 2025

- New mistralrs and llamacpp version
- mistralrs: Handle Gemma 3 and Llama 4 as vision models
- Update the dynamo-run docs to use Qwen 3
- Our pre-processor now supports Llama 4's newer multi-modal `config.json`
- Upgrade minijinja to handle Qwen 3's prompt template

For Llama 4 we'll need to limit the max seq len. vllm says:
> To serve at least one request with the models's max seq len (10485760), (240.00 GiB KV cache is needed,...

I was able to run Llama 4 with llamacpp and a quantized GGUF, with Dynamo doing the pre-processing.
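
The figures in the vllm message give a rough sense of how far the max seq len would need to come down. The sketch below derives a per-token KV cache cost from those two numbers and estimates the longest sequence that fits in a given budget. It assumes the cost scales linearly with sequence length and that the reported 240.00 GiB is exact, neither of which is guaranteed; `max_seq_len_for_budget` is a hypothetical helper.

```python
GIB = 1024 ** 3

# Figures quoted in the vllm message above.
reported_max_seq_len = 10_485_760
reported_kv_cache_bytes = 240 * GIB

# Assumption: KV cache cost grows linearly with sequence length.
bytes_per_token = reported_kv_cache_bytes // reported_max_seq_len


def max_seq_len_for_budget(budget_gib: float) -> int:
    """Longest sequence (in tokens) that fits in budget_gib of KV cache."""
    return int(budget_gib * GIB) // bytes_per_token


print(bytes_per_token)             # 24576 bytes, i.e. 24 KiB per token
print(max_seq_len_for_budget(24))  # 1048576 tokens
```
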


25 Feb, 2025 1 commit

refactor: move libs to lib dir · 08fcd7e9

Neelay Shah authored Feb 24, 2025


Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
