OpenDAS / text-generation-inference
Commit abd24dd3 (Unverified)
Authored Sep 19, 2024 by Daniël de Kok; committed by GitHub, Sep 19, 2024

doc: clarify that `--quantize` is not needed for pre-quantized models (#2536)

parent c1037601

Showing 3 changed files with 9 additions and 2 deletions:
docs/source/reference/launcher.md  +3 -1
flake.nix  +1 -0
launcher/src/main.rs  +5 -1

docs/source/reference/launcher.md

    @@ -55,7 +55,9 @@ Options:
     ## QUANTIZE
     ```shell
           --quantize <QUANTIZE>
    -          Whether you want the model to be quantized
    +          Quantization method to use for the model. It is not necessary to specify this option for pre-quantized models, since the quantization method is read from the model configuration.
    +
    +          Marlin kernels will be used automatically for GPTQ/AWQ models.

               [env: QUANTIZE=]

flake.nix

    @@ -157,6 +157,7 @@
     pyright
     pytest
     pytest-asyncio
    +redocly
     ruff
     syrupy
     ]);

launcher/src/main.rs

    @@ -367,7 +367,11 @@ struct Args {
         #[clap(long, env)]
         num_shard: Option<usize>,

    -    /// Whether you want the model to be quantized.
    +    /// Quantization method to use for the model. It is not necessary to specify this option
    +    /// for pre-quantized models, since the quantization method is read from the model
    +    /// configuration.
    +    ///
    +    /// Marlin kernels will be used automatically for GPTQ/AWQ models.
         #[clap(long, env, value_enum)]
         quantize: Option<Quantization>,
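
The new doc comment says the quantization method is read from the model configuration when `--quantize` is omitted. The following is a minimal, standard-library-only sketch of that fallback, not the launcher's actual code. The two-variant `Quantization` enum, the `from_name` helper, and `resolve_quantization` are hypothetical names introduced for illustration, and the rule that an explicit flag takes precedence over the configuration is an assumption; the commit only states that the flag is unnecessary for pre-quantized models.

```rust
/// Hypothetical subset of quantization methods, for illustration only.
/// The real `Quantization` enum in the launcher is not shown in this diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Quantization {
    Awq,
    Gptq,
}

impl Quantization {
    /// Parse a method name as it might appear in a model configuration.
    /// Unknown names yield `None` instead of an error.
    fn from_name(name: &str) -> Option<Self> {
        match name.to_ascii_lowercase().as_str() {
            "awq" => Some(Quantization::Awq),
            "gptq" => Some(Quantization::Gptq),
            _ => None,
        }
    }
}

/// Pick the effective method.
/// Assumption: an explicit `--quantize` value wins; otherwise fall back to
/// the method named in the model configuration, if any.
fn resolve_quantization(
    cli: Option<Quantization>,
    config_method: Option<&str>,
) -> Option<Quantization> {
    cli.or_else(|| config_method.and_then(Quantization::from_name))
}

fn main() {
    // Pre-quantized model, no flag: the method comes from the configuration.
    let from_config = resolve_quantization(None, Some("gptq"));
    assert_eq!(from_config, Some(Quantization::Gptq));

    // Explicit flag: used as given (assumed precedence).
    let from_flag = resolve_quantization(Some(Quantization::Awq), Some("gptq"));
    assert_eq!(from_flag, Some(Quantization::Awq));

    // Unquantized model, no flag: nothing to apply.
    assert_eq!(resolve_quantization(None, None), None);

    println!("from config: {:?}, from flag: {:?}", from_config, from_flag);
}
```

Keeping the field as `Option<Quantization>` is what lets "not specified" stay distinct from any concrete method, so the fallback to the model configuration needs no sentinel value.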