OpenDAS / ColossalAI · Commits

Commit c174c4fc (unverified)
Authored by binmakeswell on Jan 11, 2024; committed by GitHub on Jan 11, 2024
Parent: e830ef91

[doc] fix doc typo (#5256)

* [doc] fix annotation display
* [doc] fix llama2 doc

Showing 2 changed files with 14 additions and 16 deletions:

- colossalai/shardformer/README.md (+13, −13)
- examples/language/llama2/README.md (+1, −3)
colossalai/shardformer/README.md

@@ -116,18 +116,18 @@ We will follow this roadmap to develop Shardformer:
 | model | tensor parallel | pipeline parallel | lazy initialization | xformer | flash attn2 | jit fused operator | fused layernorm | sequence parallel | overlap |
 | :------: | :-----: | :-----: | :--------: | :---------: | :------: | :-----: | :-----: | :--------: | :---------: |
-| bert | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] |
+| bert | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] |
-| t5 | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| t5 | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [ ] | [ ] |
-| llama V1/V2 | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| llama V1/V2 | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [ ] | [ ] |
-| gpt2 | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] |
+| gpt2 | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] |
-| opt | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| opt | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [ ] | [ ] |
-| bloom | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] |
+| bloom | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] |
-| chatglm2 | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] | [x] |
+| chatglm2 | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] | [√] |
-| vit | [x] | [x] | [ ] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| vit | [√] | [√] | [ ] | [√] | [√] | [√] | [√] | [ ] | [ ] |
-| whisper | [x] | [x] | [x] | [x] | [x] | [ ] | [x] | [ ] | [ ] |
+| whisper | [√] | [√] | [√] | [√] | [√] | [ ] | [√] | [ ] | [ ] |
-| sam | [x] | [ ] | [ ] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| sam | [√] | [ ] | [ ] | [√] | [√] | [√] | [√] | [ ] | [ ] |
-| blip2 | [x] | [ ] | [ ] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| blip2 | [√] | [ ] | [ ] | [√] | [√] | [√] | [√] | [ ] | [ ] |
-| falcon | [x] | [x] | [x] | [x] | [x] | [ ] | [x] | [ ] | [ ] |
+| falcon | [√] | [√] | [√] | [√] | [√] | [ ] | [√] | [ ] | [ ] |
 | roberta | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] |
 | albert | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] |
 | ernie | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] |
@@ -137,7 +137,7 @@ We will follow this roadmap to develop Shardformer:
 | swin | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] |
 | swin V2 | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] |
 | qwen | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] | [ ] |
-| mistral | [x] | [ ] | [ ] | [x] | [x] | [x] | [x] | [ ] | [ ] |
+| mistral | [√] | [ ] | [ ] | [√] | [√] | [√] | [√] | [ ] | [ ] |
 ## 💡 API Design
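For context on what the `[√]` marks enable: each column of this support matrix corresponds to an optimization switch on Shardformer's `ShardConfig`. Below is a minimal sketch of sharding a model with several of these features turned on. It assumes the `ShardConfig`/`ShardFormer` API of this era of the codebase; the flag names and the `launch_from_torch` call come from that API, not from this diff.

```python
import colossalai
from transformers import BertForMaskedLM
from colossalai.shardformer import ShardConfig, ShardFormer

# Initialize torch.distributed; tensor parallelism shards weights
# across the ranks of the (default) process group.
colossalai.launch_from_torch(config={})

model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Each flag mirrors a column of the table above (assumed names).
shard_config = ShardConfig(
    enable_tensor_parallelism=True,    # "tensor parallel" column
    enable_fused_normalization=True,   # "fused layernorm" column
    enable_flash_attention=True,       # "flash attn2" column
    enable_jit_fused=True,             # "jit fused operator" column
)
sharded_model, shared_params = ShardFormer(shard_config=shard_config).optimize(model)
```

Run it under a multi-process launcher (e.g. `colossalai run --nproc_per_node 2 shard_bert.py`) so the tensor-parallel group has more than one rank.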
examples/language/llama2/README.md

@@ -6,7 +6,6 @@
 </p>
 - 70 billion parameter LLaMA2 model training accelerated by 195%
-[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/examples/language/llama2)
 [[blog]](https://www.hpc-ai.tech/blog/70b-llama2-training)
 ### LLaMA1
@@ -15,7 +14,6 @@
 </p>
 - 65-billion-parameter large model pretraining accelerated by 38%
-[[code]](https://github.com/hpcaitech/ColossalAI/tree/example/llama/examples/language/llama)
 [[blog]](https://www.hpc-ai.tech/blog/large-model-pretraining)
 ## Dataset
@@ -123,7 +121,7 @@ Here we will show an example of how to run training
 llama pretraining with `gemini, batch_size=16, sequence_length=4096, gradient_checkpoint=True, flash_attn=True`.
 #### a. Running environment
-This experiment was performed on 4 computing nodes with 32 A800 GPUs in total for LLaMA-1 65B. The nodes are
+This experiment was performed on 4 computing nodes with 32 A800/H800 80GB GPUs in total for LLaMA-1 65B or LLaMA-2 70B. The nodes are
 connected with RDMA and GPUs within one node are fully connected with NVLink.
 #### b. Running command
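The `gemini, batch_size=16, sequence_length=4096, gradient_checkpoint=True, flash_attn=True` string above is the benchmark configuration, where `gemini` selects ColossalAI's Gemini (chunk-based ZeRO) plugin. Below is a minimal sketch of wiring that plugin up through the `Booster` API with a stand-in model; the plugin arguments are illustrative assumptions, since the exact flags `benchmark.py` passes are outside this diff.

```python
import torch.nn as nn
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin
from colossalai.nn.optimizer import HybridAdam

# Distributed init (e.g. under `colossalai run` or `torchrun`).
colossalai.launch_from_torch(config={})

# Gemini manages parameters and gradients in chunks and shards them
# ZeRO-style; precision/placement values are assumptions for illustration.
plugin = GeminiPlugin(precision="bf16", placement_policy="static")
booster = Booster(plugin=plugin)

model = nn.Linear(4096, 4096)                        # stand-in for the LLaMA model
optimizer = HybridAdam(model.parameters(), lr=3e-4)  # Gemini-friendly optimizer
model, optimizer, *_ = booster.boost(model, optimizer)
```

For a real LLaMA run, the `gradient_checkpoint=True` part of the configuration would correspond to enabling checkpointing on the Hugging Face model (via `model.gradient_checkpointing_enable()`) before boosting.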