Commits · 7a341f86f5606d36cd7a7a60b027d4f078765c22 · OpenDAS / dynamo

08 Jul, 2025 1 commit
- feat: simplify k8s deployment (#1708) · 7a341f86
  julienmancuso authored Jul 07, 2025
  
  7a341f86
07 Jul, 2025 10 commits
- feat: add crds for vllm and llm examples (#1766) · 5505507b
  mohammedabdulwahhab authored Jul 07, 2025
```
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
Co-authored-by: Hannah Zhang <hannahz@nvidia.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
```
  5505507b
- feat: vllm speculative decoding metrics (#1549) · 439e977d
  jain-ria authored Jul 07, 2025
```
Signed-off-by: jain-ria <riajain@NVIDIA.com>
Co-authored-by: Alec <35311602+alec-flowers@users.noreply.github.com>
```
  439e977d
- fix: for handling zombie process (#1801) · a9c0e0c7
  Neelay Shah authored Jul 07, 2025
  
  a9c0e0c7
- fix: include load and sla planner doc (#1803) · c9a60278
  Hongkuan Zhou authored Jul 07, 2025
  
  c9a60278
- docs: Remove the outdated limitation (#1802) · b1509ea7
  Tanmay Verma authored Jul 07, 2025
  
  b1509ea7
- chore: update versions for 0.3.2 release (#1793) · c4935b34
  Anant Sharma authored Jul 07, 2025
  
  c4935b34
- fix: add setuptools dependency (#1792) · 533b8cee
  Anant Sharma authored Jul 07, 2025
  
  533b8cee
- feat: Failure Detection while Responses are returning (#1671) · b4ddca99
  Jacky authored Jul 07, 2025
  
  b4ddca99
- feat: add flush_cache endpoint to sglang (#1769) · bd91dca6
  ishandhanani authored Jul 07, 2025
  
  bd91dca6
- fix: Set TRTLLM_USE_UCX_KVCACHE by default in container image (#1777) · b2044566
  Tanmay Verma authored Jul 07, 2025
  
  b2044566
06 Jul, 2025 1 commit
- feat: automate slurm handling in sglang example. (#1730) · 1630f8ba
  fsaady authored Jul 06, 2025
```
Signed-off-by: Fadi Saady <fsaady@nvidia.com>
```
  1630f8ba
04 Jul, 2025 1 commit
- docs: draft glossary for review (#1722) · dda59e31
  Kristen Kelleher authored Jul 03, 2025
  
  dda59e31
03 Jul, 2025 10 commits
- chore: merge attributions for 0.3.1 release (#1590) (#1763) · bd0d67d3
  Anant Sharma authored Jul 03, 2025
  
  bd0d67d3
- chore: update nixl to latest 0.3.1 commit (#1762) · a9241b61
  Anant Sharma authored Jul 03, 2025
  
  a9241b61
- chore(engines): Upgrade mistralrs to 0.6.0 (#1767) · 4ab47617
  Graham King authored Jul 03, 2025
  
  4ab47617
- feat: Add experimental WideEP + EPLB dis-aggregated example for TRTLLM (#1690) · 7a353e61
  Ryan McCormick authored Jul 04, 2025
```
Co-authored-by: tanmayv25 <tanmay2592@gmail.com>
```
  7a353e61
- feat: Implement frontend tokenization for embedding requests (#1494) · 47e7fde7
  Tom O'Brien authored Jul 03, 2025
  
  47e7fde7
- test: fault tolerance tests (#1444) · 36f03d40
  Neelay Shah authored Jul 03, 2025
```
Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
```
  36f03d40
- feat: graceful shutdown for sglang example (#1764) · fb213a2f
  Hongkuan Zhou authored Jul 03, 2025
  
  fb213a2f
- chore(sglang): readme and instruction fixes (#1761) · 8bfc61ac
  ishandhanani authored Jul 03, 2025
  
  8bfc61ac
- docs: merge changes from 0.3.1 release (#1543) (#1759) · 6901c7c0
  Anant Sharma authored Jul 03, 2025
```
Co-authored-by: Kristen Kelleher <kkelleher@nvidia.com>
```
  6901c7c0
- chore: sgl container hash and python path bump (#1741) · 2f38e10f
  ishandhanani authored Jul 02, 2025
  
  2f38e10f
02 Jul, 2025 4 commits
- feat: add dynamo components for sglang (#1721) · 9cbf8031
  ishandhanani authored Jul 02, 2025
  
  9cbf8031
- fix: Add handling for ignore_eos sampling param in trtllm example base engine (#1726) · 008bb1e6
  Indrajit Bhosale authored Jul 02, 2025
  
  008bb1e6
- docs: Add GitHub Pages deployment to private website for release branches (#1717) · 45f0e424
  Meenakshi Sharma authored Jul 02, 2025
```
Signed-off-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
```
  45f0e424
- chore: fix typo for dynamo-run docs (#1720) · 7fd379a7
  Zhongdongming Dai authored Jul 02, 2025
  
  7fd379a7
01 Jul, 2025 10 commits
- feat: Validation engine for validating OpenAI api request data (#1674) · ee86bad3
  Nathan Barry authored Jul 01, 2025
  
  ee86bad3
- feat: vllm mocker enhancement (#1236) · f0652d89
  Yan Ru Pei authored Jul 01, 2025
  
  f0652d89
- feat: add grafana dcgm dashboard config file (#1701) · 0d6cae85
  sanshang-nv authored Jul 02, 2025
  
  0d6cae85
- chore: add sglang codeowners (#1719) · d4676f8a
  ishandhanani authored Jul 01, 2025
  
  d4676f8a
- fix: default to None initialization of routing config (#1713) · 0a32b344
  Alec authored Jul 01, 2025
  
  0a32b344
- fix: Prometheus to pull from dcgm-exporter:9400 instead of 9401 (#1707) · 54c21168
  Keiven C authored Jul 01, 2025
```
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  54c21168
- feat: Support for Responses API (#1694) · dfbd741d
  Paul Hendricks authored Jul 01, 2025
  
  dfbd741d
- fix(bindings): Default router config in bindings (#1716) · edf00c5c
  Graham King authored Jul 01, 2025
```
  * Added a default temperature value for text generation requests when no temperature is specified.
  * Improved handling of missing configuration values to prevent errors during model initialization.
```
  edf00c5c
- fix: Fix main (#1712) · 6365a015
  jthomson04 authored Jun 30, 2025
  
  6365a015
- feat: Approximate KV Routing (#1636) · aaf283bb
  jthomson04 authored Jun 30, 2025
  
  aaf283bb
30 Jun, 2025 3 commits
- fix: bump vLLM commit and revert side channel host change for DS R1 DEP deployment (#1696) · 9cd9993d
  GuanLuo authored Jun 30, 2025
```
Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
```
  9cd9993d
- feat: support sla planner in vllm_v1 example (#1680) · 2bed47eb
  Hongkuan Zhou authored Jun 30, 2025
  
  2bed47eb
- chore(dynamo-run): Refactor to library (#1687) · 92f06b0e
  Graham King authored Jun 30, 2025

Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.

Example usage:

1. Create a `LocalModel`:

```
let local_model = LocalModelBuilder::default()
    .model_path("Qwen/Qwen3-0.6B")
    .http_port(8080)
    .build()
    .await?;
```

2. Make an engine:

```
let engine_config = EngineConfig::StaticFull {
    engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
    model: Box::new(local_model),
};
```

3. Connect it to an input and run it:

```
dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
```
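
For readers without the crates at hand, the builder shape from step 1 can be sketched with the standard library alone. This is only an illustration: the `LocalModel` and `LocalModelBuilder` below are hypothetical stand-ins, not the `dynamo-llm` definitions. The real `build()` is awaited, while this one is synchronous, and the fallback port of 8080 is an assumption made for the sketch.

```rust
/// Hypothetical stand-in for a built model configuration
/// (not the real `dynamo-llm` type).
#[derive(Debug, Clone, PartialEq)]
pub struct LocalModel {
    pub model_path: String,
    pub http_port: u16,
}

/// Hypothetical builder that mirrors the call shape shown in step 1.
#[derive(Debug, Default)]
pub struct LocalModelBuilder {
    model_path: Option<String>,
    http_port: Option<u16>,
}

impl LocalModelBuilder {
    /// Record the model path. Returns `&mut Self` so calls can be chained.
    pub fn model_path(&mut self, path: &str) -> &mut Self {
        self.model_path = Some(path.to_string());
        self
    }

    /// Record the HTTP port. Returns `&mut Self` so calls can be chained.
    pub fn http_port(&mut self, port: u16) -> &mut Self {
        self.http_port = Some(port);
        self
    }

    /// Check required fields and produce the model.
    /// Synchronous in this sketch; the real builder is async.
    pub fn build(&mut self) -> Result<LocalModel, String> {
        let model_path = self
            .model_path
            .take()
            .ok_or_else(|| "model_path is required".to_string())?;
        Ok(LocalModel {
            model_path,
            // Assumed default for the sketch, not taken from dynamo-llm.
            http_port: self.http_port.unwrap_or(8080),
        })
    }
}
```

Keeping every field optional until `build()` lets the required-field check live in one place, so a missing `model_path` surfaces as an error value rather than a panic.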

For https://github.com/ai-dynamo/dynamo/issues/1647

Code Rabbit summary, thanks:
  * Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
  * Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
  * Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
  * Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
  * Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
  * Streamlined configuration and validation for flags and router settings.
  * Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.
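
The last bullet, validation of input and output combinations, could look roughly like the sketch below. Everything in it is hypothetical: these `Input` and `Output` enums are simplified stand-ins for illustration (only the `Input::Http` name appears in the example above), and the rejected combination, an endpoint input paired with a dynamic output, is an assumed rule rather than the one the commit enforces.

```rust
/// Hypothetical, simplified input modes (interactive text, HTTP server,
/// batch file, distributed endpoint). Not the real `dynamo-llm` enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Input {
    Text,
    Http,
    Batch(String),
    Endpoint(String),
}

/// Hypothetical output side: a named local engine, or dynamic discovery.
#[derive(Debug, Clone, PartialEq)]
pub enum Output {
    Engine(String),
    Dynamic,
}

/// Reject combinations that cannot work together.
/// The single rule here is an assumption made for this sketch.
pub fn validate(input: &Input, output: &Output) -> Result<(), String> {
    match (input, output) {
        (Input::Endpoint(path), Output::Dynamic) => Err(format!(
            "endpoint input '{path}' cannot be combined with a dynamic output"
        )),
        _ => Ok(()),
    }
}
```

Matching on the `(input, output)` pair keeps each rule on one line, and a new incompatible pair becomes one extra match arm.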

92f06b0e