1. 11 Sep, 2024 1 commit
    • Fix tokenization yi (#2507) · dae3bf1d
      Nicolas Patry authored
      * Fixing odd tokenization self modifications on the Rust side (load and
      resave in Python).
      
      * Fixing the builds ?
      
      * Fix the gh action?
      
      * Fixing the location ?
      
      * Validation is odd.
      
      * Try a faster runner
      
      * Upgrade python version.
      
      * Remove sccache
      
      * No sccache.
      
      * Getting libpython maybe ?
      
      * List stuff.
      
      * Monkey it up.
      
      * have no idea at this point
      
      * Tmp.
      
      * Shot in the dark.
      
      * Tmate the hell out of this.
      
      * Desperation.
      
      * WTF.
      
      * -y.
      
      * Apparently 3.10 is not available anymore.
      
      * Updating the dockerfile to make libpython discoverable at runtime too.
      
      * Put back rust tests.
      
      * Why do we want mkl on AMD ?
      
      * Forcing 3.11 ?
  2. 05 Jul, 2024 1 commit
    • Refactor dead code - Removing all `flash_xxx.py` files. (#2166) · fb2f74e2
      Nicolas Patry authored
      * Refactor dead code.
      
      * First working step.
      
      * Remove a lot of duplicated code.
      
      * More dead code.
      
      * More cleanup.
      
      * Fix Santacoder test.
      
      * Fixing the simple tests.
      
      * Fixing sharding.
      
      * Fixes for VLM.
      
      * Fixing santacoder (num_kv_heads hardcoded).
      
      * Removing more dead code.
      
      * Fixing `config.n_head`.
      
      * Stopping earlier because of `<end_of_utterance>` in idefics2.
      
      * Addresses comments.
      
      * Removing the dead code.
      
      * Fuse back mistral into FlashCausalLM.
      
      * Finish removal.
      
      * Fixing docs + causal_lm `batch_class`.
      
      * Fixing docs + causal.lm.
      
      * Add default to Gemma Causality.
      
      * Default value for gemma/gemma2.
      
      * Wrong default.
  3. 07 Jun, 2024 1 commit
    • server: use chunked inputs · bf3c8137
      Daniël de Kok authored
      The router will now send the input as chunks in addition to a single
      string. This change modifies the server to process chunked input
      rather than strings. This also allows us to remove the image
      extraction code from the server.
  4. 14 Dec, 2023 1 commit
  5. 11 Dec, 2023 2 commits
  6. 08 Jun, 2023 1 commit
  7. 02 Jun, 2023 1 commit
  8. 26 May, 2023 1 commit
  9. 24 May, 2023 1 commit
  10. 24 Apr, 2023 2 commits
  11. 20 Apr, 2023 1 commit
  12. 09 Apr, 2023 1 commit
  13. 16 Mar, 2023 1 commit
  14. 07 Mar, 2023 1 commit
  15. 24 Feb, 2023 1 commit
  16. 03 Feb, 2023 1 commit
  17. 02 Feb, 2023 1 commit
  18. 31 Jan, 2023 3 commits
  19. 20 Jan, 2023 1 commit
  20. 15 Dec, 2022 1 commit
  21. 12 Dec, 2022 1 commit
  22. 08 Dec, 2022 1 commit