1. 09 Feb, 2026 1 commit
    • MatejKosec's avatar
      fix: profiler deployment timeout handling for MoE models (#6086) · 67329d10
      MatejKosec authored
      Wrap wait_for_deployment_ready() in try/except TimeoutError for both prefill and decode profiling sweeps
      On timeout: log error, record via add_profiling_error(), clean up the timed-out deployment, and continue to the next parallelization mapping
      Previously, a single deployment timeout would crash the entire profiler job
      67329d10
  2. 06 Feb, 2026 3 commits
  3. 05 Feb, 2026 3 commits
  4. 04 Feb, 2026 2 commits
  5. 03 Feb, 2026 2 commits
  6. 29 Jan, 2026 1 commit
  7. 28 Jan, 2026 1 commit
  8. 27 Jan, 2026 2 commits
  9. 26 Jan, 2026 1 commit
  10. 23 Jan, 2026 1 commit
  11. 21 Jan, 2026 1 commit
  12. 18 Jan, 2026 1 commit
  13. 16 Jan, 2026 3 commits
  14. 15 Jan, 2026 1 commit
  15. 14 Jan, 2026 1 commit
  16. 13 Jan, 2026 1 commit
  17. 12 Jan, 2026 2 commits
  18. 09 Jan, 2026 1 commit
  19. 08 Jan, 2026 1 commit
  20. 07 Jan, 2026 1 commit
  21. 06 Jan, 2026 2 commits
  22. 05 Jan, 2026 1 commit
  23. 02 Jan, 2026 2 commits
  24. 01 Jan, 2026 1 commit
  25. 31 Dec, 2025 2 commits
  26. 23 Dec, 2025 1 commit
  27. 19 Dec, 2025 1 commit