Commit ae658b89 authored by Rick Ho

add megatron example

parent f9bec836
A modified version of Megatron-LM that works with FastMoE can be found in
[this repository](https://github.com/laekov/fmoe-megatron).
Using `fmoe.megatron.create_moe_mlp` to replace the `ParallelMLP` module in
Megatron's transformer model is all you need.
In our fork, the required modification is located at line 425 of
`megatron/model/transformer.py`, as follows.
```Python
# MLP
if args.num_experts == 1:
    self.mlp = ParallelMLP(init_method,
                           output_layer_init_method)
else:
    from fmoe.megatron import create_moe_mlp
    self.mlp = create_moe_mlp(args)
```
After adding the `--num-experts` argument to `megatron/arguments.py`, FastMoE
is enabled without extra burden.
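
A minimal sketch of how the argument could be registered, assuming Megatron's
usual argparse-based argument groups; the group title, default value, and
helper name `_add_fmoe_args` are illustrative, not taken from the fork.

```Python
# In megatron/arguments.py (sketch): register the flag checked above.
def _add_fmoe_args(parser):
    group = parser.add_argument_group(title='fastmoe')
    group.add_argument('--num-experts', type=int, default=1,
                       help='Number of experts per MoE layer; 1 keeps the '
                            'original ParallelMLP, >1 enables FastMoE.')
    return parser
```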