  1. 02 May, 2024 1 commit
    • Add HQQ quantization support (#29637) · 59952994
      mobicham authored
      
      
      * update HQQ transformers integration
      
      * push import_utils.py
      
      * add force_hooks check in modeling_utils.py
      
      * fix | with Optional
      
      * force bias as param
      
      * check bias is Tensor
      
      * force forward for multi-gpu
      
      * review fixes pass
      
      * remove torch grad()
      
      * if any key in linear_tags fix
      
      * add cpu/disk check
      
      * isinstance return
      
      * add multigpu test + refactor tests
      
      * clean hqq_utils imports in hqq.py
      
      * clean hqq_utils imports in quantizer_hqq.py
      
      * delete hqq_utils.py
      
      * Delete src/transformers/utils/hqq_utils.py
      
      * ruff init
      
      * remove torch.float16 from __init__ in test
      
      * refactor test
      
      * isinstance -> type in quantizer_hqq.py
      
      * cpu/disk device_map check in quantizer_hqq.py
      
      * remove type(module) nn.linear check in quantizer_hqq.py
      
      * add BaseQuantizeConfig import inside HqqConfig init
      
      * remove hqq import in hqq.py
      
      * remove accelerate import from test_hqq.py
      
      * quant config.py doc update
      
      * add hqqconfig to main_classes doc
      
      * make style
      
      * __init__ fix
      
      * ruff __init__
      
      * skip_modules list
      
      * hqqconfig format fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * test_hqq.py remove mistral comment
      
      * remove self.using_multi_gpu is False
      
      * torch_dtype default val set and logger.info
      
      * hqq.py isinstance fix
      
      * remove torch=None
      
      * torch_device test_hqq
      
      * rename test_hqq
      
      * MODEL_ID in test_hqq
      
      * quantizer_hqq setattr fix
      
      * quantizer_hqq typo fix
      
      * imports quantizer_hqq.py
      
      * isinstance quantizer_hqq
      
      * hqq_layer.bias reformat quantizer_hqq
      
      * Step 2 as comment in quantizer_hqq
      
      * prepare_for_hqq_linear() comment
      
      * keep_in_fp32_modules fix
      
      * HqqHfQuantizer reformat
      
      * quantization.md hqqconfig
      
      * quantization.md model example reformat
      
      * quantization.md # space
      
      * quantization.md space   })
      
      * quantization.md space   })
      
      * quantization_config fix doc
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * axis value check in quantization_config
      
      * format
      
      * dynamic config explanation
      
      * quant config method in quantization.md
      
      * remove shard-level progress
      
      * .cuda fix modeling_utils
      
      * test_hqq fixes
      
      * make fix-copies
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  2. 25 Apr, 2024 1 commit
  3. 22 Apr, 2024 1 commit
  4. 02 Apr, 2024 1 commit
  5. 15 Mar, 2024 1 commit
  6. 12 Mar, 2024 1 commit
  7. 06 Mar, 2024 1 commit
  8. 05 Mar, 2024 1 commit
  9. 16 Feb, 2024 1 commit
  10. 14 Feb, 2024 2 commits
  11. 05 Feb, 2024 1 commit
  12. 02 Feb, 2024 1 commit
  13. 01 Feb, 2024 1 commit
  14. 25 Jan, 2024 1 commit
  15. 24 Jan, 2024 1 commit
    • [docs] DeepSpeed (#28542) · 738ec75c
      Steven Liu authored
      * config
      
      * optim
      
      * pre deploy
      
      * deploy
      
      * save weights, memory, troubleshoot, non-Trainer
      
      * done
  16. 12 Jan, 2024 1 commit
  17. 02 Jan, 2024 1 commit
  18. 20 Dec, 2023 1 commit
  19. 18 Dec, 2023 1 commit
  20. 15 Dec, 2023 2 commits
  21. 11 Dec, 2023 1 commit
  22. 28 Nov, 2023 1 commit
  23. 27 Nov, 2023 1 commit
  24. 24 Nov, 2023 2 commits
  25. 20 Nov, 2023 1 commit
  26. 13 Nov, 2023 1 commit
  27. 09 Nov, 2023 1 commit
  28. 06 Nov, 2023 2 commits
  29. 01 Nov, 2023 2 commits
  30. 31 Oct, 2023 2 commits
  31. 30 Oct, 2023 1 commit
  32. 27 Oct, 2023 1 commit
  33. 26 Oct, 2023 1 commit
    • add exllamav2 arg (#26437) · 8214d6e7
      Marc Sun authored
      * add_ xllamav2 arg
      
      * add test
      
      * style
      
      * add check
      
      * add doc
      
      * replace by use_exllama_v2
      
      * fix tests
      
      * fix doc
      
      * style
      
      * better condition
      
      * fix logic
      
      * add deprecate msg
  34. 25 Oct, 2023 1 commit