  1. 06 Dec, 2024 1 commit
      Auto max prefill (#2797) · 5df80590
      Nicolas Patry authored
      * Attempt at automatic max batch prefill.
      
      * Taking into account number of shards.
      
      * Adding more cards.
      
      * Adding A100 + H100
      
      * Adding a few more cards.
      
      * Logprobs cost too much.
      
      * h100 better name, and keep factor of 2
      
      * Damn inflated sparse tflops.
      
      * Typo in h100.
      
      * Updated the flops calculation (checked with fvcore).
      
      * chunking by default.
      
      * Fix prefix caching for chat completion since we removed logprobs.
      
      * More tests.
      
      * Dropping all the prefill logprobs.
      
      * Add a flag that enables users to get logprobs back.
      
      * Repairing prompt token counting.
      
      * Fixing a few tests.
      
      * Remove some scaffolding.
      
      * Attempting to reduce the issues (workarounds for now).