[Doc] Add note to `gte-Qwen2` models (#11808)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Commit c0efe92d (unverified), authored Jan 07, 2025 by Cyrus Leung; committed by GitHub Jan 07, 2025

Showing with 3 additions and 0 deletions

docs/source/models/supported_models.md +3 -0

--- a/docs/source/models/supported_models.md
+++ b/docs/source/models/supported_models.md
@@ -430,6 +430,9 @@ You can set `--hf-overrides '{"is_causal": false}'` to change the attention mask

 On the other hand, its 1.5B variant (`Alibaba-NLP/gte-Qwen2-1.5B-instruct`) uses causal attention
 despite being described otherwise on its model card.
+
+Regardless of the variant, you need to enable `--trust-remote-code` for the correct tokenizer to be
+loaded. See [relevant issue on HF Transformers](https://github.com/huggingface/transformers/issues/34882).
 ```

 If your model is not in the above list, we will try to automatically convert the model using
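
For illustration, the flags discussed in this hunk can be combined as in the sketch below. It only assembles a `vllm serve` command line and does not start a server. The helper name `build_serve_command` is hypothetical, and passing `is_causal=False` for the 7B variant is an assumption based on the `--hf-overrides` example in the hunk header, not something this commit states.

```python
import json
import shlex
from typing import List, Optional


def build_serve_command(model: str, is_causal: Optional[bool] = None) -> List[str]:
    """Assemble an argv list for `vllm serve` (hypothetical helper, not part of vLLM).

    `--trust-remote-code` is always included, following the note added in this
    commit that the gte-Qwen2 tokenizer only loads correctly with it enabled.
    """
    argv = ["vllm", "serve", model, "--trust-remote-code"]
    if is_causal is not None:
        # Serialize the override as JSON, matching the form
        # --hf-overrides '{"is_causal": false}' shown in the docs.
        argv += ["--hf-overrides", json.dumps({"is_causal": is_causal})]
    return argv


# Assumption: the 7B variant is the one that needs the attention-mask override.
cmd = build_serve_command("Alibaba-NLP/gte-Qwen2-7B-instruct", is_causal=False)

# shlex.join quotes the JSON argument so the line can be pasted into a shell.
print(shlex.join(cmd))
```

For the 1.5B variant, which the documentation describes as already using causal attention, calling the helper without `is_causal` leaves the override out while still trusting remote code.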