1. 05 Jul, 2024 1 commit
    • Refactor dead code - Removing all `flash_xxx.py` files. (#2166) · fb2f74e2
      Nicolas Patry authored
      * Refactor dead code.
      
      * First working step.
      
      * Remove a lot of duplicated code.
      
      * More dead code.
      
      * More cleanup.
      
      * Fix Santacoder test.
      
      * Fixing the simple tests.
      
      * Fixing sharding.
      
      * Fixes for VLM.
      
      * Fixing santacoder (num_kv_heads hardcoded).
      
      * Removing more dead code.
      
      * Fixing `config.n_head`.
      
      * Stopping earlier because of `<end_of_utterance>` in idefics2.
      
      * Addresses comments.
      
      * Removing the dead code.
      
      * Fuse back mistral into FlashCausalLM.
      
      * Finish removal.
      
      * Fixing docs + causal_lm `batch_class`.
      
      * Fixing docs + causal.lm.
      
      * Add default to Gemma Causality.
      
      * Default value for gemma/gemma2.
      
      * Wrong default.
  2. 07 Jun, 2024 1 commit
    • server: use chunked inputs · bf3c8137
      Daniël de Kok authored
      The router now sends the input as chunks in addition to a single
      string. This change modifies the server to process the chunked input
      rather than the string. This also allows us to remove the image
      extraction code from the server.
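A minimal sketch of what consuming chunked input could look like on the server side. The `TextChunk` and `ImageChunk` types and the `split_chunks` helper are hypothetical names chosen for illustration; they are not taken from the commit, which may structure its chunks differently. The point is only that typed chunks let the server separate text from images without extracting images from a single string.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Hypothetical chunk types, assumed for this sketch only.
@dataclass
class TextChunk:
    text: str


@dataclass
class ImageChunk:
    data: bytes
    mimetype: str


Chunk = Union[TextChunk, ImageChunk]


def split_chunks(chunks: List[Chunk]) -> Tuple[str, List[ImageChunk]]:
    """Collect the text and the images from an already-chunked input.

    Because each chunk carries its own type, no string parsing is
    needed to find the images.
    """
    texts: List[str] = []
    images: List[ImageChunk] = []
    for chunk in chunks:
        if isinstance(chunk, TextChunk):
            texts.append(chunk.text)
        elif isinstance(chunk, ImageChunk):
            images.append(chunk)
        else:
            raise TypeError(f"unsupported chunk type: {type(chunk).__name__}")
    return "".join(texts), images
```

With a single-string input, the server would instead have to scan the string for embedded image references, which is the kind of extraction code the commit message says can be removed.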
  3. 14 Dec, 2023 1 commit
  4. 11 Dec, 2023 2 commits
  5. 08 Jun, 2023 1 commit
  6. 02 Jun, 2023 1 commit
  7. 26 May, 2023 1 commit
  8. 24 May, 2023 1 commit
  9. 24 Apr, 2023 2 commits
  10. 20 Apr, 2023 1 commit
  11. 09 Apr, 2023 1 commit
  12. 16 Mar, 2023 1 commit
  13. 07 Mar, 2023 1 commit
  14. 24 Feb, 2023 1 commit
  15. 03 Feb, 2023 1 commit
  16. 02 Feb, 2023 1 commit
  17. 31 Jan, 2023 3 commits
  18. 20 Jan, 2023 1 commit
  19. 15 Dec, 2022 1 commit
  20. 12 Dec, 2022 1 commit
  21. 08 Dec, 2022 1 commit