Commits · d849f7eccabdd850e2c7cb5e6103d6f8b39b0a77 · OpenDAS / dynamo

02 Jun, 2025 2 commits
- feat: expose router configurations to dynamo-run (#1259) · d849f7ec
  Hongkuan Zhou authored Jun 02, 2025
  
  d849f7ec
- chore: Remove PreprocessedRequest alias BackendInput (#1307) · 3f6a7472
  Graham King authored Jun 02, 2025
```
It was confusing to have two names for one type.

This tidy up started in #1064 , is now complete.
```
  3f6a7472
28 May, 2025 1 commit
- fix: dynamo-run add warning if block-size different (#1233) · e450c2c7
  Alec authored May 28, 2025
  
  e450c2c7
27 May, 2025 1 commit
- feat(http): add health check endpoint (#1037) · 39d01eac
  ishandhanani authored May 27, 2025
  
  39d01eac
22 May, 2025 1 commit

feat(dynamo-run): Allow setting KV cache block size (#1175) · 183f2b32

Graham King authored May 22, 2025

Example:
```
dynamo-run out=<engine> <model> --kv-cache-block-size 64
```

In a distributed system this goes on the worker node and is propagated to ingress via the model deployment card.

Previously hard coded to 16, which is now the default.

- Load context_length from model. Closes #1172
- Store context length and KV cache block size in Model Deployment Card #1170

183f2b32

21 May, 2025 2 commits
- fix(llmctl): Use ModelWatcher instead of direct etcd operations (#1150) · 3e8e38a9
  Graham King authored May 21, 2025
  
  3e8e38a9
- chore: Fix model removal on instance stop, refactor discovery (#1142) · b520bf44
  Graham King authored May 21, 2025
```
- Stop advertising a model when it's last instance stops. Previously was when any instance stops.
- Faster locks on model manager.
- Move discovery code out of http, as it is used by all inputs.
```
  b520bf44