08 May, 2025 (1 commit)
      feat: Qwen3, Gemma3 and Llama4 support (#1002) · ceaeba3e
      Graham King authored
      - New mistralrs and llamacpp versions
      - mistralrs: handle Gemma 3 and Llama 4 as vision models
      - Update the dynamo-run docs to use Qwen 3
      - Our pre-processor now supports Llama 4's newer multi-modal `config.json`
      - Upgrade minijinja to handle Qwen 3's prompt template
      
      For Llama 4 we'll need to limit the max seq len. vLLM says:
      > To serve at least one request with the models's max seq len (10485760), (240.00 GiB KV cache is needed,...
      
      I was able to run Llama 4 with llamacpp and a quantized GGUF, with Dynamo doing the pre-processing.
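      Why capping the max seq len helps: KV-cache memory grows linearly with the sequence-length cap, so the 240 GiB figure quoted above shrinks proportionally. A rough sketch using only the two numbers from the vLLM message (the 131,072-token cap is an illustrative choice, not a recommended value):

      ```python
      FULL_SEQ = 10_485_760   # Llama 4's advertised max seq len (from the vLLM message)
      FULL_KV_GIB = 240.0     # KV cache vLLM says serving one such request needs

      # Implied per-token KV-cache cost (~24 KiB/token).
      BYTES_PER_TOKEN = FULL_KV_GIB * 2**30 / FULL_SEQ

      def kv_gib(max_seq_len: int) -> float:
          """Estimated KV cache (GiB) for one request at the given seq-len cap."""
          return max_seq_len * BYTES_PER_TOKEN / 2**30

      print(f"{kv_gib(131_072):.1f} GiB")  # a 128K-token cap needs ~3 GiB
      ```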