    Use symmetric quantization in the `quantize` subcommand (#2120) · dbb23fbf
    Daniël de Kok authored
    Packing of asymmetric quantization is broken: all (q)zeros values
    of `0` get reset to `1`, resulting in a loss of accuracy. Use
    symmetric quantization instead. To distinguish models quantized
    symmetrically from those quantized asymmetrically, a new config
    tensor `gptq_sym` is added. If this tensor is not present, we
    assume `sym=False`.
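
The fallback described above can be sketched as follows. This is a minimal illustration, not the code in `weights.py`: the helper name `read_gptq_sym` is hypothetical, the checkpoint is modelled as a plain mapping, and the assumption that `gptq_sym` is stored as a one-element tensor is ours rather than something the commit message states.

```python
from typing import Mapping, Sequence


def read_gptq_sym(config_tensors: Mapping[str, Sequence[int]]) -> bool:
    """Return the `sym` setting recorded in a checkpoint's config tensors.

    `config_tensors` is a stand-in for the checkpoint's tensor store
    (assumption: each config value is a one-element sequence).
    """
    value = config_tensors.get("gptq_sym")
    if value is None:
        # Checkpoints written before `gptq_sym` existed carry no marker,
        # so they are treated as asymmetric, matching the commit message.
        return False
    return bool(value[0])


# Illustrative checkpoints: the other keys are placeholders for whatever
# config tensors the checkpoint already stores.
old_checkpoint = {"gptq_bits": [4], "gptq_groupsize": [128]}
new_checkpoint = {**old_checkpoint, "gptq_sym": [1]}

print(read_gptq_sym(old_checkpoint))
print(read_gptq_sym(new_checkpoint))
```

Defaulting to `sym=False` when the tensor is missing keeps previously quantized models loading with the settings they were produced with, while newly quantized models carry an explicit marker.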
weights.py 12.1 KB