@@ -45,6 +45,7 @@ colossalai run --nproc_per_node 4 auto_parallel_with_resnet.py
You should expect to see a log like the following. The log shows the edge cost on the computation graph as well as the sharding strategy chosen for each operation. For example, `layer1_0_conv1 S01R = S01R X RR` means that the first dimension (the batch dimension) of the input and output is sharded while the weight is fully replicated (`S` means sharded, `R` means replicated); this is simply equivalent to data parallel training.
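As a rough sketch of why a batch-sharded input with a replicated weight amounts to data parallelism (standalone illustration only, not the ColossalAI API), the following simulates two "devices" each holding one shard of the batch and the same replicated weight, and checks that concatenating the per-device outputs reproduces the unsharded result:

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 4)   # input, batch dimension first
w = torch.randn(4, 4)   # weight, replicated on every device

# Unsharded reference: RR = RR x RR
full = x @ w

# Batch dimension sharded across 2 simulated devices, weight replicated
shards = torch.chunk(x, 2, dim=0)
partials = [s @ w for s in shards]  # each "device" uses the same w

# Gathering the per-device outputs along the batch dimension
# recovers the unsharded result, i.e. data parallel training.
assert torch.allclose(torch.cat(partials, dim=0), full)
```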
**Note: This experimental feature has been tested on torch 1.12.1 and transformers 4.22.2. If you are using other versions, you may need to modify the code to make it work.**