Commit 557b84d8 authored by dongcl

Add router parameters

parent f967a24c
Pipeline #2492 passed with stage
@@ -59,6 +59,8 @@ def compute_weight_and_optimizer_memory(args, verbose=False):
 )
 # routed experts.
 + (2 * (args.ffn_hidden_size * args.hidden_size) * num_experts * gated_linear_multiplier)
+# router
++ args.hidden_size * num_experts
 # shared experts.
 + (2 * args.moe_shared_expert_intermediate_size * args.hidden_size)
 # Transformer layernorms.
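
To make the effect of this change concrete, the following is a minimal sketch of the three MoE terms visible in the hunk, written as a standalone function. The helper name `moe_mlp_param_terms` and its keyword arguments are hypothetical and are not part of the patched file; `gated_linear_multiplier` is passed in explicitly because its definition lies outside the lines shown here.

```python
def moe_mlp_param_terms(hidden_size, ffn_hidden_size, num_experts,
                        shared_expert_intermediate_size=0,
                        gated_linear_multiplier=1.0):
    """Mirror the per-layer MoE terms shown in the diff (illustrative only)."""
    # Routed experts: two projection matrices per expert, scaled by the
    # gated-linear multiplier (assumed to be supplied by the caller).
    routed = (2 * (ffn_hidden_size * hidden_size)
              * num_experts * gated_linear_multiplier)
    # Router: one hidden_size x num_experts weight matrix (the term this commit adds).
    router = hidden_size * num_experts
    # Shared experts: two projection matrices of the shared intermediate size.
    shared = 2 * shared_expert_intermediate_size * hidden_size
    return {"routed": routed, "router": router, "shared": shared}


terms = moe_mlp_param_terms(hidden_size=4, ffn_hidden_size=8, num_experts=2)
total = sum(terms.values())
```

With these toy sizes the router term contributes `4 * 2 = 8` parameters on top of the 128 routed-expert parameters, which is the quantity the estimate omitted before this commit.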