1. 30 May, 2025 1 commit
  2. 29 May, 2025 1 commit
      docs: Update Multimodal Example README (#1275) · fb4bf252
      J Wyman authored
      This change corrects the README.md file in the examples/multimodal folder:
      - Correct "vllm worker" to "decode worker"
      - Correct the claim that data is moved via NATS; embeddings are actually moved via RDMA.
      
      Additionally, this change replaces the textual graphs with Mermaid graphs for improved presentation on github.com.
  3. 28 May, 2025 2 commits
  4. 27 May, 2025 1 commit
  5. 21 May, 2025 1 commit
  6. 19 May, 2025 1 commit
      feat: Support multiple models on single ingress node (#1127) · aeb79e62
      Graham King authored
      We can now do this:
      
      - Node 1:
      
      ```
      dynamo-run in=http out=dyn
      ```
      
      - Nodes 2 and 3, two instances of the 'backend' component in the nemotron_ultra pipeline:
      
      ```
      dynamo-run in=dyn://nemotron_ultra.backend.generate out=vllm /data/models/NemotronUltra
      ```
      
      - Nodes 4 and 5, two instances of the 'backend' component in the nemotron_super pipeline:
      
      ```
      dynamo-run in=dyn://nemotron_super.backend.generate out=vllm /data/models/NemotronSuper
      ```
      
      The ingress node will discover all four instances and route correctly. We have been planning for this for a long time now.
      
      As part of this, auto-discovery is now always `out=dyn`, with no extra URL parts. Previously the ingress node could only route to a single pipeline.
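
To make the routing idea concrete, here is a minimal sketch in Python of an ingress-side table that tracks discovered instances per `dyn://` path and picks one for each request. It is not the dynamo-run implementation: the `IngressTable` class, the instance ids, the round-robin policy, and the reading of the three dotted parts as pipeline, component, and endpoint are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


def parse_dyn_path(url: str) -> tuple[str, str, str]:
    """Split 'dyn://a.b.c' into three parts.

    Assumption: the parts are (pipeline, component, endpoint), as in
    dyn://nemotron_ultra.backend.generate.
    """
    prefix = "dyn://"
    if not url.startswith(prefix):
        raise ValueError(f"not a dyn:// URL: {url!r}")
    parts = url[len(prefix):].split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected three dotted parts, got {url!r}")
    return parts[0], parts[1], parts[2]


@dataclass
class IngressTable:
    """Hypothetical registry: dotted path -> live instance ids."""

    instances: dict[str, list[str]] = field(default_factory=dict)
    cursor: dict[str, int] = field(default_factory=dict)

    def add_instance(self, url: str, instance_id: str) -> None:
        # Called when a new worker instance is discovered.
        key = ".".join(parse_dyn_path(url))
        self.instances.setdefault(key, []).append(instance_id)

    def remove_instance(self, url: str, instance_id: str) -> None:
        # Called when an instance stops; drop the path once it has no instances.
        key = ".".join(parse_dyn_path(url))
        live = self.instances.get(key, [])
        if instance_id in live:
            live.remove(instance_id)
        if not live:
            self.instances.pop(key, None)
            self.cursor.pop(key, None)

    def route(self, url: str) -> str:
        # Round-robin over live instances (an assumed policy, for illustration).
        key = ".".join(parse_dyn_path(url))
        live = self.instances.get(key)
        if not live:
            raise LookupError(f"no live instances for {key}")
        i = self.cursor.get(key, 0) % len(live)
        self.cursor[key] = i + 1
        return live[i]


ULTRA = "dyn://nemotron_ultra.backend.generate"
SUPER = "dyn://nemotron_super.backend.generate"

table = IngressTable()
table.add_instance(ULTRA, "node-2")
table.add_instance(ULTRA, "node-3")
table.add_instance(SUPER, "node-4")
table.add_instance(SUPER, "node-5")
```

With the four instances registered, repeated calls to `table.route(ULTRA)` alternate between the two nemotron_ultra workers, and removing both nemotron_super instances makes `table.route(SUPER)` raise `LookupError`, which mirrors removing a model once its instances stop.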
      
      Also:
      - Refactor endpoint / instance naming now that I understand them
      - Fix removing models when their instance stops.
  7. 09 May, 2025 1 commit
  8. 07 May, 2025 1 commit
  9. 02 May, 2025 1 commit