"vscode:/vscode.git/clone" did not exist on "edc95a0e7dc4c8e13a20fbf2c8e2b99a8c3e4549"
  1. 25 Jun, 2024 1 commit
      Add support for Marlin 2:4 sparsity (#2102) · f1f98e36
      Daniël de Kok authored
      This change adds support for 2:4 sparsity when using Marlin
      quantization. The 2:4 kernel is used when both of the following hold:
      
      * the quantizer is `marlin`;
      * the quantizer checkpoint format is `marlin_24`.
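
The selection rule above can be sketched as a small predicate. This is only an illustration: the function name `use_marlin_24_kernel` and its parameter names are hypothetical and are not taken from the commit itself.

```python
def use_marlin_24_kernel(quantizer: str, checkpoint_format: str) -> bool:
    """Return True only when both conditions listed in the commit message hold.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not the project's actual API.
    """
    return quantizer == "marlin" and checkpoint_format == "marlin_24"


if __name__ == "__main__":
    # A Marlin checkpoint in the 2:4 sparse format selects the 2:4 kernel.
    print(use_marlin_24_kernel("marlin", "marlin_24"))
    # Any other combination does not select it.
    print(use_marlin_24_kernel("gptq", "marlin_24"))
```

Under this reading, a checkpoint that satisfies only one of the two conditions would not select the 2:4 kernel.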
      
      Fixes #2098.
  2. 14 Jun, 2024 1 commit
      Add support for GPTQ Marlin (#2052) · 093a27c5
      Daniël de Kok authored
      Add support for GPTQ Marlin kernels
      
      GPTQ Marlin extends the Marlin kernels to support common GPTQ
      configurations:
      
      - bits: 4 or 8
      - groupsize: -1, 32, 64, or 128
      - desc_act: true/false
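
As a rough sketch, the supported configurations listed above could be checked as follows. The helper name `can_use_gptq_marlin` and the constant names are hypothetical; the actual code may apply further checks (for example on hardware or tensor shapes) that the commit message does not mention.

```python
# Values taken from the list in the commit message.
SUPPORTED_BITS = {4, 8}
SUPPORTED_GROUPSIZES = {-1, 32, 64, 128}


def can_use_gptq_marlin(bits: int, groupsize: int, desc_act: bool) -> bool:
    """Check a GPTQ configuration against the listed supported values.

    Hypothetical helper for illustration only. `desc_act` is accepted but
    does not restrict support, since both true and false are listed.
    """
    return bits in SUPPORTED_BITS and groupsize in SUPPORTED_GROUPSIZES


if __name__ == "__main__":
    print(can_use_gptq_marlin(bits=4, groupsize=128, desc_act=True))
    print(can_use_gptq_marlin(bits=3, groupsize=128, desc_act=False))
```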
      
      Using the GPTQ Marlin kernels requires repacking the parameters into
      the Marlin quantizer format.
      
      The kernels were contributed by Neural Magic to vLLM. We vendor them
      here for convenience.