  1. 14 Jun, 2024 3 commits
    • Harden unload for empty runners · 48702dd1
      Daniel Hiltgen authored
    • Support forced spreading for multi GPU · 5e8ff556
      Daniel Hiltgen authored
      Our default behavior today is to try to fit the model into a single GPU
      if possible. Some users would prefer the old behavior of always spreading
      across multiple GPUs, even if the model can fit into one. This change
      exposes that behavior as a tunable.
    • Improve multi-gpu handling at the limit · 6fd04ca9
      Daniel Hiltgen authored
      Still not complete. Our prediction needs refinement to account for the
      available space on each discrete GPU, so we can see how many layers fit
      on each one. Because a single layer can't be split across multiple GPUs,
      free space can't be treated as one logical block.
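The two multi-GPU commit messages above describe a placement problem that is easier to see with numbers. The following is a minimal sketch in Python, not the project's actual scheduler code: the function names, the `force_spread` parameter, the uniform layer size, and the round-robin strategy are all assumptions made for illustration. It shows why summing free memory across GPUs can overestimate how many layers fit, and how a forced-spread option differs from the default of packing onto one GPU.

```python
from typing import List

GIB = 1024 ** 3


def layers_per_gpu(free_bytes: List[int], layer_bytes: int) -> List[int]:
    """Whole layers that fit on each GPU; a layer cannot straddle two GPUs."""
    return [free // layer_bytes for free in free_bytes]


def pooled_estimate(free_bytes: List[int], layer_bytes: int) -> int:
    """Naive estimate that treats all free memory as one logical block."""
    return sum(free_bytes) // layer_bytes


def place_layers(
    free_bytes: List[int],
    layer_bytes: int,
    num_layers: int,
    force_spread: bool = False,  # hypothetical flag, for illustration only
) -> List[int]:
    """Return how many layers are assigned to each GPU.

    Layers that fit on no GPU are left unassigned, so the result may sum
    to less than num_layers.
    """
    capacity = layers_per_gpu(free_bytes, layer_bytes)
    placement = [0] * len(free_bytes)

    if not force_spread:
        # Default: use the first single GPU that can hold the whole model.
        for i, cap in enumerate(capacity):
            if cap >= num_layers:
                placement[i] = num_layers
                return placement

    # Spread: hand out layers round-robin to GPUs with remaining capacity.
    remaining = num_layers
    while remaining > 0:
        progressed = False
        for i in range(len(capacity)):
            if remaining > 0 and placement[i] < capacity[i]:
                placement[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            break  # every GPU is full
    return placement
```

With two GPUs that each have 10 GiB free and layers of 4 GiB, `pooled_estimate` reports 5 layers while `layers_per_gpu` reports 2 per GPU, so only 4 actually fit. A 2-layer model is placed as `[2, 0]` by default and as `[1, 1]` when `force_spread=True`.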
  2. 13 Jun, 2024 2 commits
  3. 12 Jun, 2024 1 commit
  4. 10 Jun, 2024 2 commits
  5. 07 Jun, 2024 1 commit
  6. 06 Jun, 2024 3 commits
  7. 05 Jun, 2024 1 commit
  8. 04 Jun, 2024 7 commits
  9. 24 May, 2024 2 commits
  10. 23 May, 2024 1 commit
  11. 21 May, 2024 1 commit
  12. 20 May, 2024 4 commits
  13. 16 May, 2024 1 commit
  14. 15 May, 2024 1 commit
  15. 14 May, 2024 10 commits