Commits · d3ca7661129e5b2f0703965650028f5783f2de52 · OpenDAS / dynamo

02 Jun, 2025 1 commit
- feat: Make llama.cpp GNU OpenMP dependency optional (#1331) · d3ca7661
  Graham King authored Jun 02, 2025
```
Do not include by default as it needs libgomp1 at runtime. Add a feature to enable it at build time.
```
29 May, 2025 1 commit

feat: Initial Granite support (#1271) · 7d0c9386

Graham King authored May 29, 2025

- Add Granite to our tokenizer
- Fix pre-processor to load context length correctly
- Add strftime_now Jinja function for prompt templates
- Update llama.cpp
- Handle trtllm errors when not using trtllm
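
As a rough illustration of the `strftime_now` item above: such a helper takes a `strftime`-style format string and returns the current time formatted with it, so a prompt template can embed today's date. The sketch below is plain Python, not Dynamo's actual implementation, and the `now` parameter is a hypothetical addition so the example is reproducible.

```python
from datetime import datetime
from typing import Optional


def strftime_now(fmt: str, now: Optional[datetime] = None) -> str:
    """Format the current local time using a strftime-style format string.

    `now` is a hypothetical hook for tests; a template would normally
    call strftime_now(fmt) with the format string only.
    """
    current = now if now is not None else datetime.now()
    return current.strftime(fmt)


# Example with a fixed clock so the result does not depend on today's date.
fixed = datetime(2025, 5, 29, 12, 0, 0)
print(strftime_now("%Y-%m-%d", now=fixed))  # prints 2025-05-29
```

A template engine would typically expose a function like this as a global so that templates can call it directly.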

Support depends on the engine:

- `mistral.rs`, our default engine, doesn't support Granite yet.

- `llama.cpp` does and works very well:
```
dynamo-run out=llamacpp ~/llms/granite-3.3-2b-instruct-Q4_K_M.gguf --context-length 16384
```

- `vllm` also works very well:
```
dynamo-run in=http out=vllm ~/llms/granite-3.3-2b-instruct --context-length 16384
```

- `sglang` mostly works, but it doesn't catch the stop token, so we catch it in the HTTP ingress and log an error. The Text ingress doesn't catch it because I disabled that check to make the raw echo engine work. There is still a bit of work to do here.

Closes: #1245

08 May, 2025 1 commit

feat: Qwen3, Gemma3 and Llama4 support (#1002) · ceaeba3e

Graham King authored May 08, 2025

- New mistralrs and llamacpp version
- mistralrs: Handle Gemma 3 and Llama 4 as vision models
- Update the dynamo-run docs to use Qwen 3
- Our pre-processor now supports Llama 4's newer multi-modal `config.json`
- Upgrade minijinja to handle Qwen 3's prompt template

For Llama 4 we'll need to limit the max seq len. vllm says:
> To serve at least one request with the models's max seq len (10485760), (240.00 GiB KV cache is needed,...

I was able to run Llama 4 with llamacpp and a quantized GGUF, with Dynamo doing the pre-processing.
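
To make the arithmetic behind that limit concrete: the two figures in the vllm message imply a per-token KV cache cost, from which a context length that fits a given memory budget can be estimated. This is a back-of-the-envelope sketch that assumes the cache grows linearly with sequence length. The 16 GiB budget and the function names are illustrative assumptions, not Dynamo or vllm settings.

```python
GIB = 1024 ** 3

# Figures quoted in the vllm message above.
MAX_SEQ_LEN = 10_485_760
KV_CACHE_BYTES_AT_MAX = 240 * GIB


def kv_bytes_per_token() -> int:
    """Per-token KV cache cost implied by the quoted figures."""
    return KV_CACHE_BYTES_AT_MAX // MAX_SEQ_LEN


def max_context_for_budget(budget_bytes: int) -> int:
    """Largest sequence length whose KV cache fits in budget_bytes,
    assuming linear growth with sequence length (an assumption)."""
    return budget_bytes // kv_bytes_per_token()


print(kv_bytes_per_token())              # 24576 (24 KiB per token)
print(max_context_for_budget(16 * GIB))  # 699050
```

Here the budget means the memory left for the KV cache after the weights are loaded. A value at or below the estimate could then be passed via `--context-length`, the flag used in the `dynamo-run` examples elsewhere in this log.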

03 Apr, 2025 1 commit

refactor: migrate engines to standalone crates (#453) · 84985d3f

Ryan Olson authored Apr 03, 2025

Moved all of `lib/llm/src/engines` to their own crates, e.g. `lib/engines/mistralrs`. This will allow publishing the `dynamo-llm` crate, as it won't have any GitHub dependencies.

The only engines in dynamo-llm will be the demo `echo` ones.
Co-authored-by: Graham King <grahamk@nvidia.com>

13 Mar, 2025 1 commit
- build: add top level rust workspace (#137) · 3d292851
  Anant Sharma authored Mar 13, 2025
  
11 Mar, 2025 1 commit
- refactor: Move rust binaries out of examples, update nixl dockerfile (#89) · e5db9e86
  Neelay Shah authored Mar 11, 2025
```
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
```
10 Mar, 2025 1 commit
- chore: update wheel name and reset versions (#73) · fc4da345
  Anant Sharma authored Mar 10, 2025
  
09 Mar, 2025 1 commit

chore: left over renaming (#67) · 678cffb4

Neelay Shah authored Mar 09, 2025


Co-authored-by: Harrison Saturley-Hall <454891+saturley-hall@users.noreply.github.com>
Co-authored-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>

08 Mar, 2025 1 commit
- chore: rename dynamo (#44) · 602352ce
  Neelay Shah authored Mar 08, 2025
```
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
```
05 Mar, 2025 1 commit
- refactor: rename triton_distributed to dynemo (#22) · 1af7433b
  Neelay Shah authored Mar 05, 2025
```
Co-authored-by: Graham King <grahamk@nvidia.com>
```
25 Feb, 2025 1 commit

refactor: move libs to lib dir · 08fcd7e9

Neelay Shah authored Feb 24, 2025


Signed-off-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

21 Feb, 2025 1 commit

feat: event plane + count · 3b7a462d

Ryan Olson authored Feb 21, 2025


Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

18 Feb, 2025 1 commit
- feat: http + llmctl (#181) · d0d35a9e
  Ryan Olson authored Feb 18, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```
13 Feb, 2025 1 commit
- fix: tcp updates + initial zmq (#176) · 2fd6592f
  Ryan Olson authored Feb 13, 2025
  
12 Feb, 2025 1 commit

fix: tcp retry and error handling updates (#169) · dddebc0d

Ryan Olson authored Feb 12, 2025


Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>

11 Feb, 2025 1 commit
- chore: update rust versions to v0.2.0 (#155) · 2e409565
  Anant Sharma authored Feb 10, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
```
10 Feb, 2025 1 commit

feat: OpenAI compatible http service (#123) · ffc6dde1

Ryan Olson authored Feb 10, 2025


Signed-off-by: Ryan Olson <ryanolson@users.noreply.github.com>
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>

06 Feb, 2025 1 commit
- add Readme, fix formatting (#120) · b3646497
  Alec authored Feb 05, 2025
```
Co-authored-by: aflowers <aflowers@nvidia.com>
```
05 Feb, 2025 2 commits
- ci: Add Copyright Verification Scripts w/ Automation (#110) · c9130f8f
  J Wyman authored Feb 05, 2025
  
- feat: add python bindings + wheel build (#94) · 03b0101e
  Ryan Olson authored Feb 05, 2025
```
Co-authored-by: Ryan McCormick <rmccormick@nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
```
04 Feb, 2025 1 commit
- feat: rust - initial commit · 5ed8c1c0
  Ryan Olson authored Feb 03, 2025
```
the journey begins
```