1. 21 May, 2025 2 commits
  2. 19 May, 2025 2 commits
    • Graham King's avatar
      feat: Support multiple models on single ingress node (#1127) · aeb79e62
      Graham King authored
      We can now do this:
      
      - Node 1:
      
      ```
      dynamo-run in=http out=dyn
      ```
      
      - Node 2 and 3, two instances of component 'backend' in the nemotron_ultra pipeline:
      
      ```
      dynamo-run in=dyn://nemotron_ultra.backend.generate out=vllm /data/models/NemotronUltra
      ```
      
      - Node 4 and 5, two instances of the 'backend' component in nemotron_super pipeline:
      
      ```
      dynamo-run in=dyn://nemotron_super.backend.generate out=vllm /data/models/NemotronSuper
      ```
      
      The ingress node will discover all four instances and route correctly. We have been planning for this for a long time now.
      
      As part of this auto-discovery is now always `out=dyn`, with no extra URL parts. Previously it could only route to a single pipeline.
      
      Also:
      - Refactor endpoint / instance naming now that I understand them
      - Fix removing models when their instance stops.
      aeb79e62
    • Tom O'Brien's avatar
      feat: Add OpenAI Embeddings interface in rust lib (#1110) · 73fdfb8a
      Tom O'Brien authored
      Implements OpenAI embeddings (interface only).
      
      - Adds ModelType::Embedding
      - Adds OpenAI embedding request/response structs
      - Adds support for embedding model discovery
      73fdfb8a
  3. 15 May, 2025 2 commits
    • Graham King's avatar
      fix: Fix default RouterMode value (#1092) · 889ab67e
      Graham King authored
      The Python bindings use the default value for RouterMode. Previously that was Random (good), but now it became None (bad).
      
      Remove the option and clean up the duplicate RouterMode. I was trying to avoid putting the `KV` enum in dynamo-runtime. Turns out adding those two characters gives us a healthy simplification, and restores the old default router value.
      
      Also clean up two noisy log messages when waiting for KV routing metrics to start in worker.
      889ab67e
    • Abrar Shivani's avatar
      feat: Use existing Tokio runtime in components (#941) · 2a5eb7e7
      Abrar Shivani authored
      The runtime library already provides a from_current method that creates and returns a Runtime object initialized with the current Tokio runtime handle. Since components do not use the runtime library directly but access it through the worker, the worker needs to be updated to create itself using a Runtime instance derived from the current Tokio runtime.
      This PR updates the http component and the worker to use the existing Tokio runtime instead of creating a new one. Other components can be similarly updated to run using the existing runtime.
      2a5eb7e7
  4. 14 May, 2025 1 commit
  5. 01 May, 2025 1 commit
  6. 04 Apr, 2025 1 commit
    • Graham King's avatar
      feat: Python decorator dynamo_worker takes optional `static` parameter without etcd (#494) · 88ad3425
      Graham King authored
      Adds `@dynamo_worker(static = True)` to create a static worker which has a predictable name and hence does not require discovery or `etcd` to be running. There can only be a single static worker per namespace / component / endpoint trio.
      
      This contrasts with the default dynamic `dynamo_worker` endpoints we have now, which get a unique random name (based on namespace/component/endpoint), and are discovered by ingress components using etcd.
      
      Also change the hello_world example to use `dynamo_worker(static = True)` so that it is exercised and demonstrated somewhere.
      
      For NIM.
      88ad3425
  7. 11 Mar, 2025 1 commit
  8. 08 Mar, 2025 1 commit
  9. 05 Mar, 2025 1 commit
  10. 25 Feb, 2025 3 commits
  11. 20 Feb, 2025 1 commit
  12. 18 Feb, 2025 1 commit