"...git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "9832ac7c736519fcfeedb88c8368cf0ab08b2b58"
Commit d269c4b2 authored by Younes Belkada, committed by GitHub

[`Mixtral`] update conversion script to reflect new changes (#28068)



* Update convert_mixtral_weights_to_hf.py

* forward contrib credits from original fix

---------
Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
parent 70a127a3
@@ -65,7 +65,7 @@ def write_model(model_path, input_base_path, model_size, safe_serialization=True
     num_shards = 1
     # For some reason this is a string in the params.json
-    sliding_window = int(params["sliding_window"])
+    sliding_window = int(params["sliding_window"]) if "sliding_window" in params else None
     n_layers = params["num_hidden_layers"]
     n_heads = params["num_attention_heads"]
     n_heads_per_shard = n_heads // num_shards
......
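The changed line can be exercised in isolation. The sketch below is illustrative only: `read_sliding_window` is a hypothetical helper that does not exist in the conversion script, and the sample dictionaries are made up rather than taken from a real `params.json`. It shows that the new expression still converts the string value to an integer when the key is present, and falls back to `None` instead of raising `KeyError` when the key is missing.

```python
def read_sliding_window(params):
    """Hypothetical helper mirroring the new expression in the diff.

    Returns the sliding window as an int, or None when the key is absent.
    """
    return int(params["sliding_window"]) if "sliding_window" in params else None


# Made-up example inputs, not real checkpoint parameters.
with_key = {"sliding_window": "4096", "num_hidden_layers": 32}
without_key = {"num_hidden_layers": 32}

print(read_sliding_window(with_key))     # 4096
print(read_sliding_window(without_key))  # None
```

Note that the membership test only covers a missing key. If the key were present with a null value, `int(None)` would still raise `TypeError`; whether real checkpoints ever contain such a value is not stated in this commit.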