Commits · 012236ee4e9ce974765c3e60c10945f50a68ce08 · OpenDAS / dynamo

11 Mar, 2026 1 commit
- feat(anthropic): add thinking block support and preamble stripping to /v1/messages (#7137) · 012236ee
  MatejKosec authored Mar 10, 2026
```
Signed-off-by: Matej Kosec <mkosec@nvidia.com>
```
  012236ee
06 Mar, 2026 1 commit

fix: reject prompts exceeding max_seq_len with HTTP 400 (#6635) · 124ecd98

Yuewei Na authored Mar 05, 2026


Signed-off-by: Yuewei Na <nv-yna@users.noreply.github.com>
Co-authored-by: Yuewei Na <nv-yna@users.noreply.github.com>

124ecd98

10 Feb, 2026 1 commit
- feat: Add metric tokenizer_latency_ms (#6092) · f1bcb175
  Graham King authored Feb 10, 2026
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  f1bcb175
05 Feb, 2026 1 commit

chore: remove unused NIM specific code (part 2) (#5893) · cb7ebdd7

Keiven C authored Feb 05, 2026


Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>

cb7ebdd7

02 Jan, 2026 1 commit
- chore: update all copyright headers in repo to 2026 (#5130) · cf433e68
  Tushar Sharma authored Jan 02, 2026
```
Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
```
  cf433e68
19 Dec, 2025 1 commit
- feat: Runtime media decoder config (#5011) · d2faf0e6
  milesial authored Dec 18, 2025
```
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
```
  d2faf0e6
02 Dec, 2025 1 commit

fix: ModelDeploymentCard obtains full set of eos_token_ids by taking union from different files (#3192) · 4ace4c85

GuanLuo authored Dec 02, 2025

Signed-off-by: Guan Luo <gluo@nvidia.com>
Signed-off-by: Guan Luo <41310872+GuanLuo@users.noreply.github.com>

4ace4c85

17 Nov, 2025 1 commit

refactor: centralize environment variable constants (#4083) · 0e77d344

Keiven C authored Nov 17, 2025


Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>

0e77d344

03 Nov, 2025 1 commit
- feat: Reject unsupported parameters with 400 Bad Request (#4021) · c837b5ba
  KrishnanPrash authored Nov 03, 2025
```
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
```
  c837b5ba
31 Oct, 2025 1 commit
- feat: Media HTTP fetching and b64 decoding (#3967) · e30a3054
  milesial authored Oct 31, 2025
```
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
```
  e30a3054
27 Oct, 2025 1 commit
- feat: Media URL passthrough in OAI preprocessor (#3733) · a79122c6
  milesial authored Oct 27, 2025
```
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
```
  a79122c6
17 Sep, 2025 2 commits
- feat: add chat_template_kwargs param to v1/chat/completion (#3016) · eb6722e3
  Chi McIsaac authored Sep 17, 2025
```
Signed-off-by: Chi McIsaac <chixie.mcisaac@gmail.com>
```
  eb6722e3
- feat: Make part of discovery re-usable (#3073) · 9060ce12
  Graham King authored Sep 17, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  9060ce12
05 Sep, 2025 1 commit
- fix: Load the tokenizer JSON once for chat and completions. (#2910) · cb5a657a
  Graham King authored Sep 05, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  cb5a657a
03 Sep, 2025 1 commit
- feat: Add --custom-jinja-template argument to pass a custom chat template for vLLM (#2829) · c920cbd9
  KrishnanPrash authored Sep 03, 2025
```
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
```
  c920cbd9
22 Aug, 2025 1 commit
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
  bce74588
19 Aug, 2025 1 commit
- chore: Bring async-openai into repo as request starter (#2520) · 199b9a30
  nachiketb-nvidia authored Aug 19, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  199b9a30
12 Aug, 2025 1 commit

feat: Add frontend support for `min_tokens` and `ignore_eos` (outside of `nvext`) and Structured Output / Guided Decoding (#2380) · 18bb779e

KrishnanPrash authored Aug 12, 2025

Signed-off-by: KrishnanPrash <140860868+KrishnanPrash@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Ayush Agarwal <ayushag@nvidia.com>

18bb779e

07 Aug, 2025 2 commits
- chore: Remove service_name from ModelDeploymentCard (#2349) · 1954fcfa
  Graham King authored Aug 07, 2025
  
  1954fcfa
- fix: improve HF token handling in preprocessor tests (#2321) · ccc8815b
  Keiven C authored Aug 06, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  ccc8815b
22 May, 2025 1 commit

fix: Fix race condition in kv_router unit test (#1174) · 3bde1e45

Graham King authored May 22, 2025

Removed the hard-coded sleeps and explained what we're testing.

Closes https://github.com/ai-dynamo/dynamo/issues/1132

The race condition is that `apply_event` only sends a message on a channel; it does not apply the event directly. At some later point the tokio runtime schedules the task running the channel receiver, which applies the event. If that had not happened by the time the test checked the result, the test would fail.
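
The same pattern can be reproduced outside the repo. The sketch below is a Python `asyncio` analogue, not the actual Rust/tokio test: the `Indexer` class, its `flush` method, and the event string are hypothetical names chosen for illustration. It shows that the event is still unapplied right after `apply_event` returns, and that waiting on the queue (rather than sleeping) makes the check deterministic.

```python
import asyncio


class Indexer:
    """Toy stand-in (hypothetical) whose events are applied by a background task."""

    def __init__(self) -> None:
        self.applied: list[str] = []
        self._queue: asyncio.Queue[str] = asyncio.Queue()
        # Receiver task; it only runs when the event loop schedules it.
        self._worker = asyncio.create_task(self._run())

    def apply_event(self, event: str) -> None:
        # Only enqueues the event; nothing has been applied when this returns.
        self._queue.put_nowait(event)

    async def _run(self) -> None:
        while True:
            event = await self._queue.get()
            self.applied.append(event)
            self._queue.task_done()

    async def flush(self) -> None:
        # Wait until every queued event has been processed by the receiver.
        await self._queue.join()

    def close(self) -> None:
        self._worker.cancel()


async def demo() -> tuple[list[str], list[str]]:
    indexer = Indexer()
    indexer.apply_event("stored:block-1")
    before = list(indexer.applied)  # receiver has not been scheduled yet
    await indexer.flush()           # deterministic wait instead of a sleep
    after = list(indexer.applied)
    indexer.close()
    return before, after


before, after = asyncio.run(demo())
```

Asserting on `before` would be the racy check; asserting on `after` is the stable one, because `flush` returns only once the receiver has handled the event.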

3bde1e45

06 May, 2025 1 commit

feat: dynamo-run <-> python interop (#934) · 99cd9d85

Graham King authored May 05, 2025

Adding this to a Python script makes it register on the network so that `dynamo-run` can discover it and send it requests:
```
from dynamo.llm import register_llm

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"
await register_llm(endpoint, MODEL, 3)
```

Full vllm example, with pre-processing in dynamo:
- `dynamo-run in=text out=dyn://dynamo.backend.generate`
- `cd lib/bindings/python/examples/hello_world`
- `python server_vllm.py`

This builds on top of the work to move the pre-processor to the ingress side. It means we can decouple Rust and Python using NATS as the bus.

The `register_llm` call does this:

- Download the model from HF if necessary
- Load the model deployment card from the HF folder or extract from GGUF
- Push the tokenizer config etc into NATS object store so ingress can access it from a different machine
- Publish the model deployment card to ETCD
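
The steps above can be sketched with in-memory stand-ins. This is not the `register_llm` implementation: `register_model_sketch`, the two dictionaries standing in for the NATS object store and ETCD, the key layout, and the card fields are all assumptions made for illustration, and the download step is skipped by passing the files in directly.

```python
import hashlib
import json

# Hypothetical in-memory stand-ins for the NATS object store and ETCD.
object_store: dict[str, bytes] = {}
kv_store: dict[str, str] = {}


def register_model_sketch(endpoint: str, model_name: str, files: dict[str, bytes]) -> str:
    """Mimic the described steps for a model whose files are already local."""
    # Step 1 (download from HF if necessary) is skipped: `files` plays the
    # role of the local model folder.
    # Step 2: build a minimal "deployment card" describing the model.
    card = {"model": model_name, "endpoint": endpoint, "artifacts": {}}
    # Step 3: push each artifact to the object store so that an ingress on
    # another machine could fetch it; record a checksum in the card.
    for name, blob in sorted(files.items()):
        object_store[f"{model_name}/{name}"] = blob
        card["artifacts"][name] = hashlib.sha256(blob).hexdigest()
    # Step 4: publish the card under a key the ingress could look up.
    card_key = f"models/{model_name}"
    kv_store[card_key] = json.dumps(card, sort_keys=True)
    return card_key


card_key = register_model_sketch(
    "dyn://dynamo.backend.generate",
    "Qwen/Qwen2.5-0.5B-Instruct",
    {"tokenizer_config.json": b'{"placeholder": true}'},
)
published = json.loads(kv_store[card_key])
```

The point of the split is that large artifacts go to the object store while only the small card is published, so a consumer can find the card first and fetch the artifacts it references afterwards.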

99cd9d85

08 Mar, 2025 1 commit
- chore: rename dynamo (#44) · 602352ce
  Neelay Shah authored Mar 08, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
  602352ce
05 Mar, 2025 1 commit
- refactor: rename triton_distributed to dynemo (#22) · 1af7433b
  Neelay Shah authored Mar 05, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  1af7433b
27 Feb, 2025 1 commit
- refactor: rename ChatCompletionRequest to NvCreateChatCompletionRequest (#284) · 96866f43
  Paul Hendricks authored Feb 27, 2025
  
  96866f43
26 Feb, 2025 1 commit
- refactor: using async_openai · 86aff237
  Paul Hendricks authored Feb 26, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
  86aff237
25 Feb, 2025 3 commits

feat: tio support preprocessor (#265) · 72064d84

Graham King authored Feb 25, 2025

Add backend type `EngineConfig::StaticCore` that wraps the engine in a preprocessor (prompt templating and tokenization).

Add example engine `echo_core` (`out=echo_core`) which takes and returns tokens. A nice side effect is that it echoes the full prompt template, including the system prompt, whereas `echo_full` echoes only the user prompt.
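
The difference between the two echo engines can be illustrated with a toy model. The sketch below is Python rather than the project's Rust, and everything in it is hypothetical: the whitespace `ToyTokenizer`, the `<system>`/`<user>` template, and the `with_preprocessor` wrapper only mirror the idea of wrapping a token-level engine in templating and tokenization.

```python
class ToyTokenizer:
    """Whitespace tokenizer with a growing vocabulary (illustrative only)."""

    def __init__(self) -> None:
        self.vocab: dict[str, int] = {}
        self.words: list[str] = []

    def encode(self, text: str) -> list[int]:
        ids = []
        for word in text.split():
            if word not in self.vocab:
                self.vocab[word] = len(self.words)
                self.words.append(word)
            ids.append(self.vocab[word])
        return ids

    def decode(self, ids: list[int]) -> str:
        return " ".join(self.words[i] for i in ids)


def echo_core(token_ids: list[int]) -> list[int]:
    # Token-in, token-out engine: returns exactly the tokens it received.
    return list(token_ids)


def echo_full(user_prompt: str) -> str:
    # Text-in, text-out engine: it never sees the template or system prompt.
    return user_prompt


def with_preprocessor(engine, tokenizer, system_prompt):
    """Wrap a token-level engine with prompt templating and tokenization."""

    def run(user_prompt: str) -> str:
        templated = f"<system> {system_prompt} <user> {user_prompt}"
        return tokenizer.decode(engine(tokenizer.encode(templated)))

    return run


static_core = with_preprocessor(echo_core, ToyTokenizer(), "Be brief.")
core_reply = static_core("hello there")
full_reply = echo_full("hello there")
```

Because the wrapped engine receives the templated, tokenized prompt, `core_reply` contains the system prompt and template markers, while `full_reply` is just the user text.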

![image](https://github.com/user-attachments/assets/27ec0a7b-a27d-4e69-96ea-1ffa0822ea90)

72064d84

ci: Add rust checks to missing directories (#239) · c06b95ff
Ryan McCormick authored Feb 25, 2025
```
Signed-off-by: Ryan McCormick <rmccormick@nvidia.com>
```
c06b95ff

refactor: move libs to lib dir · 08fcd7e9

Neelay Shah authored Feb 24, 2025


Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

08fcd7e9

24 Feb, 2025 1 commit
- feat: add rust based tokenizer · 4f6f63cd
  Biswa Panda authored Feb 24, 2025
  
  4f6f63cd