1. 12 Jun, 2023 1 commit
      docs(launcher): fix CUDA_VISIBLE_DEVICES helper comment (#441) · d4eb60f4
      A.J authored
      # What does this PR do?
      It fixes a typo in comments that reference the environment
      variable `CUDA_VISIBLE_DEVICES`. No misspelled references to this
      variable were found in the code logic, so no undefined behaviour or
      bugs resulted from it. This PR makes no changes to code logic.
  2. 09 Jun, 2023 1 commit
  3. 08 Jun, 2023 1 commit
      feat(server): Rework model loading (#344) · abd58ff8
      Nicolas Patry authored
      # What does this PR do?
      
      Reworked the loading logic. Idea is to use cleaner loading code:
      
      - Remove need for `no_init_weights`
      - Remove all weird `bnb_linear` and `load_weights` and
      `post_load_weights`.
      
      New code layout:
      
      - New class `Weights` in charge of handling loading the weights from
      multiple files into appropriate tensors (potentially sharded)
      - TP layers are now "shells": they contain the code that knows what kind of
      sharding is needed, plus a possible `all_reduce`. They do not inherit from
      linear; instead they contain some kind of Linear
      - the contained linear can be either FastLinear or BnbLinear, with GPTQ
      Linear coming next.
      - All modeling code is explicitly written for sharding; the process group is
      just a no-op for non-sharded code (removes a lot of test cases)
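
The layout above can be illustrated with a minimal, pure-Python sketch. It is not the code from this PR: the class bodies, the method names (`get_sharded_cols`, `load`), the `NoOpGroup` stand-in, and the file and tensor names are all assumptions made for illustration, and plain lists stand in for tensors. It only shows the idea of a `Weights` loader that slices tensors per rank and a TP "shell" that wraps a linear layer instead of inheriting from it.

```python
from typing import Dict, List

Matrix = List[List[int]]


class Weights:
    """Hypothetical loader: locates a named tensor across several files."""

    def __init__(self, files: Dict[str, Dict[str, Matrix]], rank: int, world_size: int):
        self.files = files
        # Map each tensor name to the file that holds it.
        self.routing = {name: fname for fname, tensors in files.items() for name in tensors}
        self.rank = rank
        self.world_size = world_size

    def get_tensor(self, name: str) -> Matrix:
        return self.files[self.routing[name]][name]

    def get_sharded_cols(self, name: str) -> Matrix:
        """Return this rank's slice of the input (column) dimension."""
        full = self.get_tensor(name)
        in_features = len(full[0])
        assert in_features % self.world_size == 0, "columns must split evenly"
        block = in_features // self.world_size
        start = self.rank * block
        return [row[start:start + block] for row in full]


class FastLinear:
    """Plain linear layer computing y = W x (bias omitted for brevity)."""

    def __init__(self, weight: Matrix):
        self.weight = weight

    def __call__(self, x: List[int]) -> List[int]:
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


class NoOpGroup:
    """Stand-in process group: with a single process, all_reduce does nothing."""

    def all_reduce(self, values: List[int]) -> List[int]:
        return values


class TensorParallelRowLinear:
    """'Shell' layer: owns a linear layer and knows how to shard and reduce."""

    def __init__(self, linear: FastLinear, process_group: NoOpGroup):
        self.linear = linear
        self.process_group = process_group

    @classmethod
    def load(cls, weights: Weights, name: str, process_group: NoOpGroup):
        return cls(FastLinear(weights.get_sharded_cols(name)), process_group)

    def __call__(self, x_shard: List[int]) -> List[int]:
        # Each rank computes a partial result; all_reduce would sum them.
        return self.process_group.all_reduce(self.linear(x_shard))


# Two hypothetical weight files; only one of them holds "proj.weight".
files = {
    "model-00001.bin": {"proj.weight": [[1, 2, 3, 4], [5, 6, 7, 8]]},
    "model-00002.bin": {"other.weight": [[0, 0], [0, 0]]},
}
x = [1, 1, 1, 1]

# Non-sharded case: world_size=1, the process group is a no-op.
single = TensorParallelRowLinear.load(Weights(files, 0, 1), "proj.weight", NoOpGroup())
unsharded_out = single(x)

# Sharded case, simulated in one process: each rank sees half the columns
# and half the input, and the partial outputs are summed by hand.
partials = []
for rank in range(2):
    layer = TensorParallelRowLinear.load(Weights(files, rank, 2), "proj.weight", NoOpGroup())
    partials.append(layer(x[rank * 2:(rank + 1) * 2]))
sharded_out = [a + b for a, b in zip(*partials)]
```

In a real multi-process run the manual sum at the end would be replaced by an actual collective reduction; here the same modeling code path serves both the sharded and non-sharded cases, which is the point the commit message makes about removing test cases.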
      
      ![Screenshot from 2023-05-19
      23-19-59](https://github.com/huggingface/text-generation-inference/assets/204321/9a802654-74a3-488c-87a8-073743a6143f)
      
      ---------
      
      Co-authored-by: Ubuntu <ubuntu@ip-1...
  4. 05 Jun, 2023 2 commits
  5. 02 Jun, 2023 3 commits
  6. 01 Jun, 2023 4 commits
  7. 31 May, 2023 4 commits
  8. 30 May, 2023 5 commits
  9. 26 May, 2023 2 commits
  10. 25 May, 2023 1 commit
  11. 24 May, 2023 1 commit
  12. 23 May, 2023 9 commits
  13. 22 May, 2023 3 commits
  14. 16 May, 2023 2 commits
  15. 15 May, 2023 1 commit