OpenDAS / FastMoE · Commits · c96f8863

Unverified commit c96f8863, authored May 31, 2021 by Rick Ho, committed by GitHub on May 31, 2021.

Merge pull request #42 from laekov/v0.2.0-pre-release

Checkout version number to V0.2.0

Parents: d205aaeb, 411e57f5
Showing 2 changed files, with 27 additions and 8 deletions:

- examples/megatron/README.md (+16, -6)
- setup.py (+11, -2)
examples/megatron/README.md

```diff
-FastMoE currently works with both `v2.0` and `v2.1` release of
+FastMoE works with different versions of
 [Megatron-LM](https://github.com/nvidia/megatron-lm).
-Patches which you can find in this directory are used to easily enable MoE in
-different versions of Megatron-LM for training Bert. The usage is the same in
-other training scripts.
+See `fmoe/megatron/utils.py` for arguments of FastMoE.
 
-The patch works in the following way.
+An example patch is provided for `v2.2` release.
+The patch can be directly applied to add FastMoE support if you are using
+Megatron-LM v2.2.
+Otherwise, you may need to manually enable FastMoE in your codebase.
+The patch includes the following modifications.
+
+### Add arguments to Megatron's argparser
+
+In `megatron/arguments.py`, add `_add_fmoe_args` to the parser.
+
+### Patch checkpoint
+
+In `megatron/training.py`, replace `load_checkpoint` and `save_checkpoint` by functions with the same name in `fmoe.megatron.checkpointing`.
 
 ### Building the model in FastMoE style
 
-In `pretrain_bert.py`, the `fmoe.megatron.fmoefy` function is used as an
+In `megatron/training.py`, the `fmoe.megatron.fmoefy` function is used as an
 entrance to one-key introduce FastMoE layer to replace the MLP layers in the
 transformer language models.
...
```
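The new README tells users on Megatron-LM versions other than v2.2 to enable FastMoE manually. Below is a minimal sketch of what the three described hooks could look like in a Megatron-style training script; the import location of `_add_fmoe_args` (assumed to be `fmoe.megatron.utils`) and the `num_experts` keyword are assumptions inferred from the README text, not code taken from the patch.

```python
# Hypothetical sketch of the three modifications the README describes.
# Import paths and keyword arguments are assumptions, not copied from the
# v2.2 patch shipped with FastMoE.
from fmoe.megatron import fmoefy
from fmoe.megatron.checkpointing import load_checkpoint, save_checkpoint
from fmoe.megatron.utils import _add_fmoe_args  # location assumed from fmoe/megatron/utils.py


def extend_megatron_parser(parser):
    """Step 1: add FastMoE's arguments to Megatron's argparser
    (the patch does this in megatron/arguments.py)."""
    _add_fmoe_args(parser)  # argparse parsers are mutated in place
    return parser


def build_fmoe_model(model, args):
    """Step 3: replace the transformer MLP layers with FastMoE layers
    (the patch calls fmoefy from megatron/training.py)."""
    # `num_experts` is an assumed keyword; see fmoe/megatron/utils.py for
    # the arguments FastMoE actually exposes.
    model = fmoefy(model, num_experts=getattr(args, "num_experts", 4))
    return model

# Step 2: wherever Megatron's own load_checkpoint / save_checkpoint were
# called, call the same-named functions from fmoe.megatron.checkpointing
# instead, so that per-rank expert parameters are also checkpointed.
```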
setup.py

```diff
@@ -6,6 +6,15 @@ import os
 cxx_flags = []
 ext_libs = []
+authors = [
+    'Jiaao He',
+    'Jiezhong Qiu',
+    'Aohan Zeng',
+    'Tiago Antunes',
+    'Jinjun Peng',
+    'Qin Li',
+]
 
 if os.environ.get('USE_NCCL', '0') == '1':
     cxx_flags.append('-DFMOE_USE_NCCL')
     ext_libs.append('nccl')
@@ -14,9 +23,9 @@ if os.environ.get('USE_NCCL', '0') == '1':
 if __name__ == '__main__':
     setuptools.setup(
         name='fastmoe',
-        version='0.1.2',
+        version='0.2.0',
         description='An efficient Mixture-of-Experts system for PyTorch',
-        author='Jiaao He, Jiezhong Qiu and Aohan Zeng',
+        author=', '.join(authors),
         author_email='hja20@mails.tsinghua.edu.cn',
         license='Apache-2',
         url='https://github.com/laekov/fastmoe',
...
```
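On the packaging side, the commit bumps the version from 0.1.2 to 0.2.0 and replaces the hard-coded `author` string with a joined list, expanding the credited authors from three to six. A quick check of what the new metadata evaluates to (names copied from the diff; the install command in the comment is the conventional setuptools invocation, not something this commit adds):

```python
# Mirrors the authors handling added in this commit; names copied from the diff.
authors = ['Jiaao He', 'Jiezhong Qiu', 'Aohan Zeng',
           'Tiago Antunes', 'Jinjun Peng', 'Qin Li']

# setuptools receives a single comma-separated string for the `author` field:
print(', '.join(authors))
# Jiaao He, Jiezhong Qiu, Aohan Zeng, Tiago Antunes, Jinjun Peng, Qin Li

# Building with NCCL support remains opt-in via the USE_NCCL environment
# variable checked earlier in setup.py, e.g.:
#   USE_NCCL=1 python setup.py install
```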