1. 05 Jul, 2024 1 commit
      Refactor dead code - Removing all `flash_xxx.py` files. (#2166) · fb2f74e2
      Nicolas Patry authored
      * Refactor dead code.
      
      * First working step.
      
      * Remove a lot of duplicated code.
      
      * More dead code.
      
      * More cleanup.
      
      * Fix Santacoder test.
      
      * Fixing the simple tests.
      
      * Fixing sharding.
      
      * Fixes for VLM.
      
      * Fixing santacoder (num_kv_heads hardcoded).
      
      * Removing more dead code.
      
      * Fixing `config.n_head`.
      
      * Stopping earlier because of `<end_of_utterance>` in idefics2.
      
      * Addresses comments.
      
      * Removing the dead code.
      
      * Fuse back mistral into FlashCausalLM.
      
      * Finish removal.
      
      * Fixing docs + causal_lm `batch_class`.
      
      * Fixing docs + causal.lm.
      
      * Add default to Gemma Causality.
      
      * Default value for gemma/gemma2.
      
      * Wrong default.
  2. 07 Jun, 2024 1 commit
      server: use chunked inputs · bf3c8137
      Daniël de Kok authored
      The router now sends the input as chunks in addition to a single
      string. This change modifies the server to process the chunked
      input rather than the string, which also lets us remove the image
      extraction code from the server.
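
      A minimal sketch of what consuming chunked input might look like on the server side. The `Chunk` type and the `split_chunks` helper are hypothetical names used only for illustration; they are assumptions, not the project's actual API. The point is that when text and images arrive as separate chunks, the server no longer has to parse images out of a single string.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chunk:
    """Hypothetical stand-in for one piece of a chunked request input."""
    text: Optional[str] = None
    image: Optional[bytes] = None


def split_chunks(chunks: List[Chunk]) -> Tuple[str, List[bytes]]:
    """Join text chunks into one prompt and collect image payloads in order."""
    text_parts: List[str] = []
    images: List[bytes] = []
    for chunk in chunks:
        if chunk.text is not None:
            text_parts.append(chunk.text)
        elif chunk.image is not None:
            images.append(chunk.image)
        else:
            # A chunk carrying neither text nor image is treated as malformed.
            raise ValueError("empty chunk")
    return "".join(text_parts), images


# Usage: an input that interleaves text with one image.
chunks = [
    Chunk(text="Describe "),
    Chunk(image=b"\x89PNG"),
    Chunk(text="this image."),
]
prompt, images = split_chunks(chunks)
```

      Because each chunk is already typed, no string scanning or image extraction step is needed in this sketch.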
      bf3c8137
  3. 14 Dec, 2023 1 commit
  4. 02 Jun, 2023 1 commit
  5. 26 May, 2023 1 commit
  6. 09 Apr, 2023 1 commit
  7. 16 Mar, 2023 1 commit
  8. 07 Mar, 2023 1 commit
  9. 02 Feb, 2023 1 commit
  10. 31 Jan, 2023 3 commits
  11. 20 Jan, 2023 2 commits