Unverified commit dc003c30, authored by Xuanlei Zhao, committed by GitHub

[moe] merge moe into main (#4978)

* update moe module
* support openmoe
parent 8993c8a8
python convert_openmoe_ckpt.py --t5x_checkpoint_path /path/to/t5x --config_file /path/to/config --pytorch_dump_path /path/to/save
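The three flags in the command above can be illustrated with a small `argparse` sketch. This is not the content of `convert_openmoe_ckpt.py`, whose diff is not visible here. The flag names come from the command line; marking them as required, the help strings, and the `build_parser` helper are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the command above."""
    parser = argparse.ArgumentParser(
        description="Sketch of a T5X-to-PyTorch checkpoint conversion CLI"
    )
    # ASSUMPTION: all three flags are required; the real script may define defaults.
    parser.add_argument("--t5x_checkpoint_path", required=True,
                        help="Directory holding the source T5X checkpoint")
    parser.add_argument("--config_file", required=True,
                        help="Model config file (JSON) describing the architecture")
    parser.add_argument("--pytorch_dump_path", required=True,
                        help="Output directory for the converted PyTorch weights")
    return parser


# Parse the same placeholder paths used in the command above.
args = build_parser().parse_args([
    "--t5x_checkpoint_path", "/path/to/t5x",
    "--config_file", "/path/to/config",
    "--pytorch_dump_path", "/path/to/save",
])
print(args.t5x_checkpoint_path, args.config_file, args.pytorch_dump_path)
```

The paths are the placeholders from the command and need to be replaced with real locations before running an actual conversion.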
{
  "architectures": [
    "OpenMoeForCausalLM"
  ],
  "intermediate_size": 8192,
  "hidden_size": 2048,
  "num_hidden_layers": 24,
  "head_dim": 128,
  "num_attention_heads": 24,
  "dropout_rate": 0.0,
  "layer_norm_epsilon": 1e-06,
  "vocab_size": 256384,
  "hidden_act": "swiglu",
  "num_experts": 32,
  "topk": 2,
  "capacity_factor_train": 1.25,
  "capacity_factor_eval": 2.0,
  "min_capacity": 4,
  "noisy_policy": null,
  "drop_tks": true,
  "expert_parallel": null,
  "gated": true,
  "moe_layer_interval": 6
}
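To make the numbers in this config easier to read, here is a minimal sketch that derives a few quantities from it. The field values are copied from the config above. The interpretation rules are assumptions, not confirmed by the visible diff: that layer `i` (0-based) is an MoE layer when `(i + 1) % moe_layer_interval == 0`, and that per-expert capacity follows a GShard-style formula, `floor(topk * capacity_factor * num_tokens / num_experts)`, clamped from below by `min_capacity`. The helper names are hypothetical.

```python
import json
import math

# Subset of the config above; values copied verbatim.
CONFIG_JSON = """
{
  "hidden_size": 2048,
  "num_hidden_layers": 24,
  "head_dim": 128,
  "num_attention_heads": 24,
  "num_experts": 32,
  "topk": 2,
  "capacity_factor_train": 1.25,
  "capacity_factor_eval": 2.0,
  "min_capacity": 4,
  "moe_layer_interval": 6
}
"""


def attention_inner_dim(cfg: dict) -> int:
    """Total width of all attention heads (plain arithmetic on the config)."""
    return cfg["head_dim"] * cfg["num_attention_heads"]


def moe_layer_indices(cfg: dict) -> list:
    """ASSUMPTION: every `moe_layer_interval`-th layer, counted from 1, is MoE."""
    interval = cfg["moe_layer_interval"]
    return [i for i in range(cfg["num_hidden_layers"]) if (i + 1) % interval == 0]


def expert_capacity(cfg: dict, num_tokens: int, training: bool) -> int:
    """ASSUMPTION: GShard-style capacity, floored and clamped by min_capacity."""
    factor = cfg["capacity_factor_train"] if training else cfg["capacity_factor_eval"]
    capacity = math.floor(cfg["topk"] * factor * num_tokens / cfg["num_experts"])
    return max(capacity, cfg["min_capacity"])


cfg = json.loads(CONFIG_JSON)
print(attention_inner_dim(cfg))                    # 128 * 24 = 3072
print(moe_layer_indices(cfg))                      # [5, 11, 17, 23]
print(expert_capacity(cfg, 2048, training=True))   # 160
print(expert_capacity(cfg, 2048, training=False))  # 256
```

Note that `head_dim * num_attention_heads` is 3072, which differs from `hidden_size` (2048), so the attention projections presumably map between the two widths. Under the assumed layer rule, 4 of the 24 layers would be MoE layers.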