Performance of Quantized Models
===============================

.. attention::
   To be updated for Qwen3.

This section reports the generation performance of quantized models (including GPTQ and AWQ) of the Qwen2 series. Specifically, we report:

* MMLU (Accuracy)
* C-Eval (Accuracy)
* IFEval (Strict Prompt-Level Accuracy)

We use greedy decoding in evaluating all models.

+---------------------+--------------+---------+-------+--------+--------+
|                     | Quantization | Average | MMLU  | C-Eval | IFEval |
+=====================+==============+=========+=======+========+========+
| Qwen2-72B-Instruct  | BF16         | 81.3    | 82.3  | 83.8   | 77.6   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int8    | 80.7    | 81.3  | 83.4   | 77.5   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int4    | 81.2    | 80.8  | 83.9   | 78.9   |
+                     +--------------+---------+-------+--------+--------+
|                     | AWQ          | 80.4    | 80.5  | 83.9   | 76.9   |
+---------------------+--------------+---------+-------+--------+--------+
| Qwen2-7B-Instruct   | BF16         | 66.9    | 70.5  | 77.2   | 53.1   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int8    | 66.2    | 69.1  | 76.7   | 52.9   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int4    | 64.1    | 67.8  | 75.2   | 49.4   |
+                     +--------------+---------+-------+--------+--------+
|                     | AWQ          | 64.1    | 67.4  | 73.6   | 51.4   |
+---------------------+--------------+---------+-------+--------+--------+
| Qwen2-1.5B-Instruct | BF16         | 48.4    | 52.4  | 63.8   | 29.0   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int8    | 48.1    | 53.0  | 62.5   | 28.8   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int4    | 45.0    | 50.7  | 57.4   | 27.0   |
+                     +--------------+---------+-------+--------+--------+
|                     | AWQ          | 46.5    | 51.6  | 58.1   | 29.9   |
+---------------------+--------------+---------+-------+--------+--------+
| Qwen2-0.5B-Instruct | BF16         | 34.4    | 37.9  | 45.2   | 20.0   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int8    | 32.6    | 35.6  | 43.9   | 18.1   |
+                     +--------------+---------+-------+--------+--------+
|                     | GPTQ-Int4    | 29.7    | 33.0  | 39.2   | 16.8   |
+                     +--------------+---------+-------+--------+--------+
|                     | AWQ          | 31.1    | 34.4  | 42.1   | 16.7   |
+---------------------+--------------+---------+-------+--------+--------+
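The table's Average column appears to be the unweighted mean of the three benchmark scores, rounded to one decimal place (an assumption on our part; the source does not state the formula). A minimal sketch, using the Qwen2-7B-Instruct BF16 row as an example:

.. code-block:: python

   # Assumption: "Average" is the plain mean of MMLU, C-Eval, and IFEval,
   # rounded to one decimal. Scores below are the Qwen2-7B-Instruct BF16 row.
   scores = {
       "MMLU": 70.5,
       "C-Eval": 77.2,
       "IFEval": 53.1,
   }
   average = round(sum(scores.values()) / len(scores), 1)
   print(average)  # 66.9, matching the table's Average for this row

Note that an unweighted mean gives each benchmark equal influence regardless of its score scale or difficulty; small rounding differences between a recomputed mean and a reported value are therefore possible.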