"lib/bindings/python/vscode:/vscode.git/clone" did not exist on "437cae0ad0c58ce1ea755b7028841969fddbe8c5"
  1. 19 Aug, 2025 1 commit
  2. 18 Aug, 2025 1 commit
  3. 15 Aug, 2025 1 commit
  4. 14 Aug, 2025 1 commit
  5. 13 Aug, 2025 1 commit
  6. 12 Aug, 2025 1 commit
  7. 11 Aug, 2025 1 commit
  8. 07 Aug, 2025 1 commit
  9. 18 Jul, 2025 2 commits
  10. 17 Jul, 2025 1 commit
  11. 09 Jul, 2025 1 commit
  12. 07 Jul, 2025 1 commit
  13. 03 Jul, 2025 1 commit
  14. 01 Jul, 2025 2 commits
  15. 26 Jun, 2025 4 commits
  16. 25 Jun, 2025 2 commits
  17. 24 Jun, 2025 2 commits
  18. 11 Jun, 2025 1 commit
  19. 04 Jun, 2025 2 commits
  20. 03 Jun, 2025 1 commit
  21. 02 Jun, 2025 1 commit
  22. 29 May, 2025 1 commit
  23. 23 May, 2025 1 commit
  24. 19 May, 2025 1 commit
  25. 14 May, 2025 1 commit
  26. 24 Mar, 2025 1 commit
  27. 17 Mar, 2025 1 commit
    • fix(vllm,sglang): Let the engine enforce max tokens (#216) · 05765cd4
      Graham King authored
      Previously several parts of the stack ensured max tokens (for this single request) was set.
      
      Now only text input sets it (to 8k). Everything else leaves it as is, potentially blank. The engines themselves have very small defaults: 16 for vllm and 128 for sglang.
      
      Also fix dynamo-run CUDA startup message to only print if we're using an engine that would benefit from it (mistralrs, llamacpp).
  28. 14 Mar, 2025 1 commit
  29. 08 Mar, 2025 1 commit
  30. 05 Mar, 2025 1 commit
  31. 02 Mar, 2025 1 commit
  32. 28 Feb, 2025 1 commit