Unverified Commit d2768c22 authored by Graham King's avatar Graham King Committed by GitHub

fix: Extract tokenizer from GGUF for Qwen3 and Gemma3 arch (#1011)

This avoids having to pass the `--model-config` parameter to dynamo-run when using llamacpp.
parent e9cb035a
...@@ -201,13 +201,12 @@ cargo build --features llamacpp[,cuda|metal|vulkan] -p dynamo-run
 ```
 ```
-dynamo-run out=llamacpp ~/llms/Llama-3.2-3B-Instruct-Q6_K.gguf
+dynamo-run out=llamacpp ~/llms/gemma-3-1b-it-q4_0.gguf
+dynamo-run out=llamacpp ~/llms/Qwen3-0.6B-Q8_0.gguf # From https://huggingface.co/ggml-org
 ```
 Note that in some cases we are unable to extract the tokenizer from the GGUF, and so a Hugging Face checkout of a matching model must also be passed. Dynamo will use the weights from the GGUF and the pre-processor (`tokenizer.json`, etc) from the `--model-config`:
 ```
-dynamo-run out=llamacpp ~/llms/gemma-3-1b-it-q4_0.gguf --model-config ~/llms/gemma-3-1b-it
 dynamo-run out=llamacpp ~/llms/Llama-4-Scout-17B-16E-Instruct-UD-IQ1_S.gguf --model-config ~/llms/Llama-4-Scout-17B-16E-Instruct
 ```
......
...@@ -56,6 +56,8 @@ pub enum GGUFArchitecture {
     Phi3,
     Starcoder2,
     Qwen2,
+    Qwen3,
+    Gemma3,
 }
 // Wraps from_str() for some convenience:
......
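To illustrate how the two new enum variants fit in, here is a minimal, hypothetical sketch of a `GGUFArchitecture` enum with a `FromStr` implementation parsing the architecture string found in GGUF metadata. The variant list is taken from the diff above; the string spellings, error type, and overall shape are assumptions for illustration and may not match the actual dynamo-run source.

```rust
use std::str::FromStr;

// Architectures recognized when reading GGUF metadata.
// Variant names follow the diff; the full enum in the repo is longer.
#[derive(Debug, PartialEq)]
enum GGUFArchitecture {
    Phi3,
    Starcoder2,
    Qwen2,
    Qwen3,  // newly added: tokenizer can now be extracted from the GGUF
    Gemma3, // newly added: tokenizer can now be extracted from the GGUF
}

impl FromStr for GGUFArchitecture {
    type Err = String;

    // Hypothetical mapping from the GGUF architecture string to a variant.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "phi3" => Ok(Self::Phi3),
            "starcoder2" => Ok(Self::Starcoder2),
            "qwen2" => Ok(Self::Qwen2),
            "qwen3" => Ok(Self::Qwen3),
            "gemma3" => Ok(Self::Gemma3),
            other => Err(format!("unknown GGUF architecture: {other}")),
        }
    }
}

fn main() {
    // With Qwen3 and Gemma3 recognized, the embedded tokenizer can be used
    // directly instead of requiring a separate --model-config checkout.
    assert_eq!("qwen3".parse::<GGUFArchitecture>(), Ok(GGUFArchitecture::Qwen3));
    assert_eq!("gemma3".parse::<GGUFArchitecture>(), Ok(GGUFArchitecture::Gemma3));
    assert!("unknown".parse::<GGUFArchitecture>().is_err());
}
```

An unrecognized architecture falls through to the error arm, which is presumably where the `--model-config` fallback described in the docs still applies.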