evt_fugx1 / dcu_megatron
Commit 2b8d28d0 authored Mar 26, 2025 by dongcl

Add README

parent 5d997698
Showing 2 changed files with 73 additions and 1 deletion
README.md: +73 -0
dcu_megatron/adaptor/megatron_adaptor.py: +0 -1
README.md @ 2b8d28d0
# Dcu Megatron
This project introduces new features and achieves better performance by replacing Megatron functions and classes. The replacement functions and classes are registered in dcu_megatron/adaptor/megatron_adaptor.py.

Function replacement is supported:
```
from ..core.distributed.finalize_model_grads import _allreduce_word_embedding_grads

MegatronAdaptation.register('megatron.core.distributed.finalize_model_grads._allreduce_word_embedding_grads',
                            _allreduce_word_embedding_grads)
```
The code above replaces Megatron's _allreduce_word_embedding_grads with the custom _allreduce_word_embedding_grads from this project.
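For intuition, a registration of this kind can be implemented as a monkey patch: import the module named by the dotted path and rebind the attribute. The sketch below is illustrative only, not the project's actual implementation, and the helper name apply_patch is hypothetical; note that code which has already bound the original name via `from module import name` keeps the old reference, which is why adaptors of this kind typically run before Megatron modules are imported.

```
import importlib

def apply_patch(dotted_path, replacement):
    # Split 'pkg.module.attr' into the module path and the attribute name.
    module_path, attr_name = dotted_path.rsplit('.', 1)
    module = importlib.import_module(module_path)
    # Rebind the attribute so later lookups through the module
    # resolve to the replacement.
    setattr(module, attr_name, replacement)
```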
Class replacement is supported:
```
from ..core.transformer.transformer_config import TransformerConfig, MLATransformerConfig

# Transformer config
MegatronAdaptation.register('megatron.core.transformer.transformer_config.TransformerConfig',
                            TransformerConfig)
MegatronAdaptation.register('megatron.core.transformer.transformer_config.MLATransformerConfig',
                            MLATransformerConfig)
```
The code above replaces Megatron's TransformerConfig and MLATransformerConfig with the custom types.
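Assuming the registry rebinds module attributes as sketched above, the effect of the class replacement can be checked directly; this is a hypothetical check, with the module paths taken from the snippets above:

```
import megatron.core.transformer.transformer_config as megatron_tc
from dcu_megatron.core.transformer.transformer_config import TransformerConfig

# After the adaptor has run, the name inside the Megatron module
# resolves to the custom class.
assert megatron_tc.TransformerConfig is TransformerConfig
```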
Base-class replacement is supported:
```
import os

import transformer_engine as te
from megatron.core.extensions.transformer_engine import TEGroupedLinear

# Opt in via an environment variable, then rebind the base class.
if int(os.getenv("GROUPED_GEMM_BatchLinear", '0')):
    TEGroupedLinear.__bases__ = (te.pytorch.BatchLinear,)
```
The code above replaces TEGroupedLinear's parent class with te.pytorch.BatchLinear.
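As background on the mechanism: assigning to `__bases__` redirects method resolution for the class and for all existing instances, provided the old and new bases have compatible instance layouts. A toy sketch with made-up classes, unrelated to transformer_engine:

```
class OldBase:
    def forward(self):
        return "old"

class NewBase:
    def forward(self):
        return "new"

class Layer(OldBase):
    pass

layer = Layer()
print(layer.forward())        # "old"

# Swap the base class; MRO lookups now resolve through NewBase.
Layer.__bases__ = (NewBase,)
print(layer.forward())        # "new"
```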
Adding decorators is supported:
```
MegatronAdaptation.register('megatron.core.transformer.moe.moe_utils.permute',
                            torch.compile(mode='max-autotune-no-cudagraphs'),
                            apply_wrapper=True)
MegatronAdaptation.register('megatron.core.transformer.moe.moe_utils.unpermute',
                            torch.compile(mode='max-autotune-no-cudagraphs'),
                            apply_wrapper=True)
```
The code above adds a decorator to the permute and unpermute functions, with the following effect:
```
from typing import Optional

import torch

@torch.compile(mode='max-autotune-no-cudagraphs')
def permute(
    tokens,
    routing_map,
    num_out_tokens: Optional[int] = None,
    fused: bool = False,
    drop_and_pad: bool = False,
):
    ...

@torch.compile(mode='max-autotune-no-cudagraphs')
def unpermute(
    permuted_tokens: torch.Tensor,
    sorted_indices: torch.Tensor,
    restore_shape: torch.Size,
    probs: torch.Tensor = None,
    routing_map: torch.Tensor = None,
    fused: bool = False,
    drop_and_pad: bool = False,
):
    ...
```
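A plausible way for a registry to implement apply_wrapper=True is to fetch the existing function and rebind the wrapped version instead of replacing it outright. The helper below is a hypothetical sketch, not the project's code:

```
import importlib

def apply_wrapper_patch(dotted_path, decorator):
    module_path, attr_name = dotted_path.rsplit('.', 1)
    module = importlib.import_module(module_path)
    original = getattr(module, attr_name)
    # Equivalent to stacking @decorator on the original definition.
    setattr(module, attr_name, decorator(original))
```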
### Usage
To use this project, Megatron must be installed, or the megatron directory must be placed at the same level as dcu_megatron:

```
project/
├── dcu_megatron
├── megatron
└── pretrain_gpt.py
```
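Assuming the adaptor applies its registrations when the package is imported, which is common for adaptor packages of this kind (the project's exact entry point may differ), the important detail is that dcu_megatron is imported before any Megatron module:

```
# Illustrative only: import the adaptor first so its patches are in
# place before Megatron modules bind the original names.
import dcu_megatron

from megatron.core.transformer.transformer_config import TransformerConfig  # patched
```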
dcu_megatron/adaptor/megatron_adaptor.py @ 2b8d28d0
```
@@ -120,7 +120,6 @@ class CoreAdaptation(MegatronAdaptationABC):
         # Transformer config
         MegatronAdaptation.register('megatron.core.transformer.transformer_config.TransformerConfig',
                                     TransformerConfig)
-        # Transformer config
         MegatronAdaptation.register('megatron.core.transformer.transformer_config.MLATransformerConfig',
                                     MLATransformerConfig)
```