Unverified Commit abd24dd3 authored by Daniël de Kok, committed by GitHub

doc: clarify that `--quantize` is not needed for pre-quantized models (#2536)

parent c1037601
@@ -55,7 +55,9 @@ Options:
 ## QUANTIZE
 ```shell
 --quantize <QUANTIZE>
-Whether you want the model to be quantized
+Quantization method to use for the model. It is not necessary to specify this option for pre-quantized models, since the quantization method is read from the model configuration.
+Marlin kernels will be used automatically for GPTQ/AWQ models.
 [env: QUANTIZE=]
......
@@ -157,6 +157,7 @@
 pyright
 pytest
 pytest-asyncio
+redocly
 ruff
 syrupy
 ]);
......
@@ -367,7 +367,11 @@ struct Args {
 #[clap(long, env)]
 num_shard: Option<usize>,
-/// Whether you want the model to be quantized.
+/// Quantization method to use for the model. It is not necessary to specify this option
+/// for pre-quantized models, since the quantization method is read from the model
+/// configuration.
+///
+/// Marlin kernels will be used automatically for GPTQ/AWQ models.
 #[clap(long, env, value_enum)]
 quantize: Option<Quantization>,
......
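The new doc comment says the option can be omitted for pre-quantized models because the method is read from the model configuration. A minimal sketch of that fallback is given here, using only the standard library. The `ModelConfig` struct, its `quant_method` field, the reduced `Quantization` enum, and `resolve_quantization` are all hypothetical names for illustration and are not taken from the launcher source. The sketch assumes an explicit `--quantize` value wins when both are present; the commit does not state the precedence.

```rust
/// Hypothetical subset of quantization methods, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Quantization {
    Awq,
    Gptq,
    Bitsandbytes,
}

/// Hypothetical view of a model configuration. A pre-quantized checkpoint
/// is assumed to record its method here; other checkpoints leave it empty.
struct ModelConfig {
    quant_method: Option<Quantization>,
}

/// Resolve the quantization method to use.
///
/// Assumption: an explicit command-line value takes precedence, and the
/// method recorded in the model configuration is used only as a fallback.
fn resolve_quantization(
    cli: Option<Quantization>,
    config: &ModelConfig,
) -> Option<Quantization> {
    cli.or(config.quant_method)
}
```

Under these assumptions, a pre-quantized GPTQ checkpoint launched without `--quantize` resolves to `Some(Quantization::Gptq)`, while an unquantized checkpoint launched without the flag resolves to `None`.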