Commits · d2faf0e6f5ebb0edc9334cd7da4fc1d85f608437 · OpenDAS / dynamo

19 Dec, 2025 2 commits

- feat: Runtime media decoder config (#5011) · d2faf0e6
  milesial authored Dec 18, 2025
```
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
```
  d2faf0e6

- refactor: frontend/grpc tests to use dynamic ports (#4992) · c22280cc
  Keiven C authored Dec 18, 2025

  Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
  Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>

  c22280cc

11 Dec, 2025 1 commit
- feat: worker-local KvIndexer in KvEventPublisher (#4519) · 33249945
  Karen Chung authored Dec 11, 2025
```
Co-authored-by: Yan Ru Pei <yanrpei@gmail.com>
```
  33249945
02 Dec, 2025 1 commit
- feat: lora - centralize lora cache key, restructure folders, s3 resiliency (#4644) · 71f94eda
  Biswa Panda authored Dec 02, 2025
  
  71f94eda
19 Nov, 2025 1 commit
- feat: unregister discovery instance (#4459) · 781331c6
  Biswa Panda authored Nov 19, 2025
  
  781331c6
08 Nov, 2025 2 commits
- fix: refactor to use service discovery (#4092) · 09b26bf6
  mohammedabdulwahhab authored Nov 08, 2025
```
Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
```
  09b26bf6
- feat: Media decoder and fetcher options in the MDC (#4094) · 14af074e
  milesial authored Nov 07, 2025
```
Signed-off-by: Alexandre Milesi <milesial@users.noreply.github.com>
```
  14af074e
23 Oct, 2025 1 commit
- chore: Use KeyValueStoreManager instead of etcd::Client (#3822) · 7731b024
  Graham King authored Oct 23, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  7731b024
17 Oct, 2025 1 commit
- feat(frontend): Get model config files (`tokenizer.json` et al.) from MX (#3659) · 9d03b8dc
  Graham King authored Oct 17, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  9d03b8dc
16 Oct, 2025 1 commit
- feat: dp rank routing (#3597) · f978f4d1
  Yan Ru Pei authored Oct 15, 2025
```
Signed-off-by: PeaBrane <yanrpei@gmail.com>
```
  f978f4d1
15 Oct, 2025 1 commit
- feat: Python binding to download a model. (#3593) · ab0da582
  Graham King authored Oct 15, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  ab0da582
11 Oct, 2025 1 commit
- feat: implement custom backend metrics for NIM (#3266) · 65cc5337
  Keiven C authored Oct 10, 2025
```
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
```
  65cc5337
10 Oct, 2025 2 commits
- chore: Remove model_config from LocalModel (#3558) · 0e0218ff
  Graham King authored Oct 10, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  0e0218ff
- feat: Introduce storage_client in DistributedRuntime (#3507) · 7a7d397c
  Graham King authored Oct 10, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  7a7d397c
08 Oct, 2025 1 commit
- chore: Remove GGUF support (#3488) · 1b1265e6
  Graham King authored Oct 08, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  1b1265e6
07 Oct, 2025 1 commit
- chore(discovery): Watch/publish ModelDeploymentCard instead of ModelEntry (#3350) · 81162dfe
  Graham King authored Oct 07, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  81162dfe
30 Sep, 2025 2 commits
- chore: Add Key abstraction in our KeyValueStore (#3322) · 50cdae5f
  Graham King authored Sep 30, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  50cdae5f
- chore: Move model_input, model_type from ModelEntry to ModelDeploymentCard (#3292) · 6ffd20a8
  Graham King authored Sep 30, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  6ffd20a8
17 Sep, 2025 1 commit
- feat: Make part of discovery re-usable (#3073) · 9060ce12
  Graham King authored Sep 17, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  9060ce12
16 Sep, 2025 1 commit
- fix: Interactive inputs actually stops, does not ignore stop token (#3057) · 87e6e052
  Graham King authored Sep 16, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  87e6e052
05 Sep, 2025 1 commit
- fix: Load the tokenizer JSON once for chat and completions. (#2910) · cb5a657a
  Graham King authored Sep 05, 2025
```
Signed-off-by: Graham King <grahamk@nvidia.com>
```
  cb5a657a
03 Sep, 2025 3 commits

- refactor: Split ModelType to ModelInput for request and response type;... · 27fad26f
  Olga Andreeva authored Sep 03, 2025

  refactor: Split ModelType to ModelInput for request and response type; ModelType for the supported workloads (#2714)
  Signed-off-by: Guan Luo <gluo@nvidia.com>
  Signed-off-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>
  Co-authored-by: Guan Luo <gluo@nvidia.com>
  Co-authored-by: GuanLuo <41310872+GuanLuo@users.noreply.github.com>

  27fad26f

- feat: Add --custom-jinja-template argument to pass a custom chat template for vLLM (#2829) · c920cbd9
  KrishnanPrash authored Sep 03, 2025
```
Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
```
  c920cbd9
- feat: dynamo namespace isolation (#2394) · c6becbc8
  Biswa Panda authored Sep 03, 2025
```
Signed-off-by: Biswa Panda <biswa.panda@gmail.com>
```
  c6becbc8

25 Aug, 2025 1 commit
- feat: enable --dyn-reasoning-parser flag to set reasoning parser for vllm deployments (#2700) · f5a41004
  nachiketb-nvidia authored Aug 25, 2025
  
  f5a41004
22 Aug, 2025 3 commits
- chore: Rust to 1.89 and edition 2024 (#2659) · bce74588
  Graham King authored Aug 22, 2025
  
  bce74588
- feat: [vLLM] implement cli args for tool and reasoning parsers (#2619) · cbe854fc
  Ayush Agarwal authored Aug 22, 2025
  
  cbe854fc
- chore(llm): Rename protocols::Endpoint to EndpointId (#2615) · 6a358f7c
  Graham King authored Aug 22, 2025
  
  6a358f7c
19 Aug, 2025 1 commit
- feat(frontend): support setting HTTP host via CLI (--http-host) (#2523) · c5d9d267
  suzu authored Aug 19, 2025
  
  c5d9d267
18 Aug, 2025 1 commit
- feat(http): TLS support (#2492) · a4bbe492
  Graham King authored Aug 18, 2025
  
  a4bbe492
14 Aug, 2025 1 commit
- feat: add RuntimeConfig to ModelEntry (#2311) · d0a63635
  Jorge António authored Aug 14, 2025
```
Co-authored-by: Yan Ru Pei <yanrpei@gmail.com>
```
  d0a63635
13 Aug, 2025 1 commit
- feat: Allow an endpoint to serve multiple models (#2418) · 72ec5f5c
  Graham King authored Aug 13, 2025
  
  72ec5f5c
07 Aug, 2025 1 commit
- chore: Remove service_name from ModelDeploymentCard (#2349) · 1954fcfa
  Graham King authored Aug 07, 2025
  
  1954fcfa
05 Aug, 2025 1 commit
- feat: Pass user_data to register_llm for LoRA support (#2286) · 433f6012
  Chi authored Aug 05, 2025
  
  433f6012
31 Jul, 2025 1 commit
- feat: skip downloading model weights if using mocker (only tokenizer) (#2213) · bae25dc6
  Yan Ru Pei authored Jul 31, 2025
  
  bae25dc6
18 Jul, 2025 2 commits
- feat: Add migration to LLM requests (#1930) · 1f07dab7
  Jacky authored Jul 18, 2025
  
  1f07dab7
- feat(frontend): router-mode settings (#2001) · fc124360
  Graham King authored Jul 18, 2025
  
  fc124360
08 Jul, 2025 1 commit
- feat(python): Python bindings for the Dynamo CLI tools (#1799) · 2bf27924
  Graham King authored Jul 08, 2025
  
  2bf27924
01 Jul, 2025 1 commit

- fix(bindings): Default router config in bindings (#1716) · edf00c5c
  Graham King authored Jul 01, 2025

  * Added a default temperature value for text generation requests when no temperature is specified.
  * Improved handling of missing configuration values to prevent errors during model initialization.

  edf00c5c

30 Jun, 2025 1 commit

- chore(dynamo-run): Refactor to library (#1687) · 92f06b0e
  Graham King authored Jun 30, 2025

Move much of what was in the `dynamo-run` crate into `dynamo-llm` so that everyone can use it.

Example usage:

1. Create a `LocalModel`:

```
    let local_model = LocalModelBuilder::default()
        .model_path("Qwen/Qwen3-0.6B")
        .http_port(8080)
        .build()
        .await?;
```

2. Make an engine:

```
    let engine_config = EngineConfig::StaticFull {
        engine: dynamo_engine_mistralrs::make_engine(&local_model).await?,
        model: Box::new(local_model),
    };
```

3. Connect it to an input and run it:

```
    dynamo_llm::entrypoint::input::run_input(Input::Http, runtime, engine_config).await?;
```

For https://github.com/ai-dynamo/dynamo/issues/1647

Code Rabbit summary, thanks:
  * Introduced a flexible builder pattern for local model configuration, allowing advanced customization and easier initialization.
  * Added new input modes and unified input handling, supporting interactive chat, HTTP server, batch file, and distributed endpoint modes.
  * Centralized engine configuration and routing, enabling more extensible and maintainable engine management.
  * Simplified and modularized the codebase by moving input and engine logic into dedicated modules.
  * Replaced direct model construction with an asynchronous builder for improved clarity and extensibility.
  * Streamlined configuration and validation for flags and router settings.
  * Added validation to prevent incompatible input and output combinations in endpoint and dynamic modes.
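
The builder and validation points in the summary above can be illustrated with a small standalone sketch. Everything in it is hypothetical and is not the `dynamo-llm` API: the names `ModelConfigBuilder`, `InputMode`, `OutputMode`, and `validate` are invented for illustration, the builder is synchronous where the real one is awaited, the fallback port of 8080 is an assumption borrowed from the usage example, and the rule that an endpoint input cannot be paired with a dynamic output is an assumed example of an "incompatible combination".

```rust
// Hypothetical types for illustration only; NOT the dynamo-llm API.

/// Input modes loosely mirroring those named in the summary.
#[derive(Debug, Clone, PartialEq)]
pub enum InputMode {
    Http,             // HTTP server
    Text,             // interactive chat
    Batch(String),    // path to a batch file
    Endpoint(String), // distributed endpoint
}

/// Where requests are sent (assumed two-way split for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum OutputMode {
    LocalEngine, // an engine running in this process
    Dynamic,     // engines discovered at runtime
}

#[derive(Debug, Clone, PartialEq)]
pub struct ModelConfig {
    pub model_path: String,
    pub http_port: u16,
}

/// Builder that collects optional settings and checks them in `build`.
#[derive(Debug, Default)]
pub struct ModelConfigBuilder {
    model_path: Option<String>,
    http_port: Option<u16>,
}

impl ModelConfigBuilder {
    pub fn model_path(mut self, path: &str) -> Self {
        self.model_path = Some(path.to_string());
        self
    }

    pub fn http_port(mut self, port: u16) -> Self {
        self.http_port = Some(port);
        self
    }

    /// Fails if the required model path is missing; the port falls back
    /// to an assumed default of 8080.
    pub fn build(self) -> Result<ModelConfig, String> {
        let model_path = self
            .model_path
            .ok_or_else(|| "model_path is required".to_string())?;
        Ok(ModelConfig {
            model_path,
            http_port: self.http_port.unwrap_or(8080),
        })
    }
}

/// Rejects one assumed incompatible pairing: serving as an endpoint
/// while also expecting a dynamically discovered engine.
pub fn validate(input: &InputMode, output: &OutputMode) -> Result<(), String> {
    match (input, output) {
        (InputMode::Endpoint(_), OutputMode::Dynamic) => {
            Err("endpoint input cannot be combined with dynamic output".to_string())
        }
        _ => Ok(()),
    }
}
```

Running the validation before any engine is started means a bad flag combination surfaces as a single error instead of a partially initialized pipeline.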

92f06b0e