Unverified Commit c96f8863 authored by Rick Ho's avatar Rick Ho Committed by GitHub

Merge pull request #42 from laekov/v0.2.0-pre-release

Checkout version number to V0.2.0
parents d205aaeb 411e57f5
FastMoE works with different versions of
[Megatron-LM](https://github.com/nvidia/megatron-lm).
See `fmoe/megatron/utils.py` for the arguments that FastMoE adds.

An example patch is provided for the `v2.2` release.
The patch can be applied directly to add FastMoE support if you are using
Megatron-LM `v2.2`.
Otherwise, you may need to enable FastMoE in your codebase manually.
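If you keep a checkout of Megatron-LM `v2.2`, the patch can typically be
applied from its root with `git apply <path-to-the-patch>`; the exact patch
file name depends on the FastMoE release, so it is left as a placeholder here.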
The patch includes the following modifications.

### Add arguments to Megatron's argparser

In `megatron/arguments.py`, add `_add_fmoe_args` to the parser, as sketched below.
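A minimal sketch of such a hook follows; the flag names are illustrative
assumptions rather than FastMoE's exact argument set, which is defined in
`fmoe/megatron/utils.py`:

```python
import argparse

# Illustrative sketch of the FastMoE argparser hook. The flag names below
# (--fmoefy, --num-experts) are assumptions, not FastMoE's confirmed set;
# see fmoe/megatron/utils.py for the real arguments.
def _add_fmoe_args(parser):
    group = parser.add_argument_group(title='fastmoe')
    group.add_argument('--fmoefy', action='store_true',
                       help='replace Megatron MLP layers with FastMoE layers')
    group.add_argument('--num-experts', type=int, default=1,
                       help='number of experts per MoE layer')
    return parser

# In megatron/arguments.py the patch calls the hook while the parser is built:
parser = argparse.ArgumentParser(description='Megatron-LM with FastMoE')
parser = _add_fmoe_args(parser)
args = parser.parse_args(['--fmoefy', '--num-experts', '4'])
assert args.fmoefy and args.num_experts == 4
```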
### Patch checkpoint
In `megatron/training.py`, replace `load_checkpoint` and `save_checkpoint`
with the functions of the same name from `fmoe.megatron.checkpointing`,
e.g. via the import swap below.
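A minimal sketch of the swap, assuming `megatron/training.py` uses the usual
Megatron import style for these helpers:

```python
# Unpatched megatron/training.py imports the stock helpers:
#
#     from megatron.checkpointing import load_checkpoint, save_checkpoint
#
# The patch swaps the import for the FastMoE-aware versions, which keep the
# same names, so the call sites in training.py stay untouched.
from fmoe.megatron.checkpointing import load_checkpoint, save_checkpoint
```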
### Building the model in FastMoE style

In `megatron/training.py`, the `fmoe.megatron.fmoefy` function is used as a
one-line entry point that replaces the MLP layers in the transformer language
model with FastMoE layers.
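A minimal sketch of how the training script might route model construction
through `fmoefy`, assuming the `--fmoefy` and `--num-experts` arguments from
the argparser sketch above; the keyword name `num_experts` is likewise an
assumption, not FastMoE's confirmed signature:

```python
from fmoe.megatron import fmoefy

def build_moe_model(model, args):
    # `model` is the transformer built by the unpatched Megatron code path.
    # fmoefy swaps its MLP sublayers for FastMoE layers, replacing a manual
    # model-surgery pass. The keyword name `num_experts` is assumed here.
    if args.fmoefy:
        model = fmoefy(model, num_experts=args.num_experts)
    return model
```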
```diff
--- a/setup.py
+++ b/setup.py
@@ -6,6 +6,15 @@ import os
 cxx_flags = []
 ext_libs = []
 
+authors = [
+    'Jiaao He',
+    'Jiezhong Qiu',
+    'Aohan Zeng',
+    'Tiago Antunes',
+    'Jinjun Peng',
+    'Qin Li',
+]
+
 if os.environ.get('USE_NCCL', '0') == '1':
     cxx_flags.append('-DFMOE_USE_NCCL')
     ext_libs.append('nccl')
@@ -14,9 +23,9 @@ if os.environ.get('USE_NCCL', '0') == '1':
 if __name__ == '__main__':
     setuptools.setup(
         name='fastmoe',
-        version='0.1.2',
+        version='0.2.0',
         description='An efficient Mixture-of-Experts system for PyTorch',
-        author='Jiaao He, Jiezhong Qiu and Aohan Zeng',
+        author=', '.join(authors),
         author_email='hja20@mails.tsinghua.edu.cn',
         license='Apache-2',
         url='https://github.com/laekov/fastmoe',
```
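As the unchanged hunk context shows, building with the environment variable
`USE_NCCL=1` (typically `USE_NCCL=1 python setup.py install`) compiles the
extension with `-DFMOE_USE_NCCL` and links against NCCL, which FastMoE's
distributed expert mode relies on, e.g. when patching Megatron-LM.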