1. 31 Jul, 2024 1 commit
    • Handle GPTQ-Marlin loading in `GPTQMarlinWeightLoader` (#2300) · 34f7dcfd
      Daniël de Kok authored
      The `GPTQWeightsLoader` was structured like this in pseudocode:
      
      if marlin:
        Set up tensors in a way that GPTQ-Marlin expects
      else:
        Set up tensors in a way that ExLlama/GPTQ/AWQ expect
      
      However, the GPTQ-Marlin implementation details should really be in the
      `marlin` module. So move the GPTQ-Marlin branch out to a separate
      `GPTQMarlinWeightsLoader`.
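
      The split described above can be sketched roughly as follows. This is a
      minimal illustration, not the code from the commit: the `get_weights`
      method, the `select_loader` helper, the general loader's name, and the
      returned strings are all hypothetical placeholders, and the real loaders
      operate on tensors rather than strings.

```python
class GPTQMarlinWeightsLoader:
    """Hypothetical stand-in: owns only the GPTQ-Marlin tensor setup."""

    def get_weights(self, prefix: str) -> str:
        # Placeholder for "set up tensors in a way that GPTQ-Marlin expects".
        return f"{prefix}: gptq-marlin layout"


class GPTQWeightsLoader:
    """Hypothetical stand-in: the general loader, with no Marlin branch left."""

    def get_weights(self, prefix: str) -> str:
        # Placeholder for "set up tensors in a way that ExLlama/GPTQ/AWQ expect".
        return f"{prefix}: exllama/gptq/awq layout"


def select_loader(use_marlin: bool):
    """Hypothetical helper: pick a loader once, instead of branching inside one."""
    if use_marlin:
        return GPTQMarlinWeightsLoader()
    return GPTQWeightsLoader()


if __name__ == "__main__":
    print(select_loader(True).get_weights("layer0"))
    print(select_loader(False).get_weights("layer0"))
```

      In this arrangement the `if marlin:` decision is made once when the
      loader is chosen, so the Marlin-specific details can live in their own
      module.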
  2. 29 Jul, 2024 1 commit
  3. 26 Jul, 2024 1 commit
    • feat: add ruff and resolve issue (#2262) · bab02ff2
      drbh authored
      * feat: add ruff and resolve issue
      
      * fix: update client exports and adjust after rebase
      
      * fix: adjust syntax to avoid circular import
      
      * fix: adjust client ruff settings
      
      * fix: lint and refactor import check and avoid model enum as global names
      
      * fix: improve fbgemm_gpu check and lints
      
      * fix: update lints
      
      * fix: prefer comparing model enum over str
      
      * fix: adjust lints and ignore specific rules
      
      * fix: avoid unneeded quantize check
  4. 24 Jul, 2024 1 commit