Unverified Commit 6a969f0e authored by Kirthi Shankar Sivamani, committed by GitHub

[PyTorch] Update FSDP example instructions (#1719)



Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
parent 62d1b2bd
@@ -8,7 +8,7 @@
# FSDP without deferred initialization:
# Duplicate modules initialized on each device. Load on device memory reduced only after
# torch.distributed.fsdp.FullyShardedDataParallel mode shards model parameters.
-$ torchrun --standalone --nnodes=1 --nproc-per-node=$(nvidia-smi -L | wc -l) fsdp.py
+$ torchrun --standalone --nnodes=1 --nproc-per-node=$(nvidia-smi -L | wc -l) fsdp.py --no-defer-init
# Sample output on 8xL40S:
# [GPU-0] WORLD_SIZE = 8
# [GPU-0] TransformerEngine Model:
@@ -40,7 +40,7 @@
# Modules initialized with empty parameters via `device='meta'` option. Zero load on device
# memory until torch.distributed.fsdp.FullyShardedDataParallel mode triggers a reset
# on already sharded model parameters.
-$ torchrun --standalone --nnodes=1 --nproc-per-node=$(nvidia-smi -L | wc -l) fsdp.py --defer-init
+$ torchrun --standalone --nnodes=1 --nproc-per-node=$(nvidia-smi -L | wc -l) fsdp.py
# Sample output on 8xL40S:
# [GPU-0] WORLD_SIZE = 8
# ...
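For reference, here is a minimal sketch of the two initialization paths the commands above exercise. This is not the example script's exact code: the layer sizes, the init_fn helper, and the use of FSDP's param_init_fn are illustrative assumptions, and a process group is assumed to already be initialized by torchrun.

import torch
import transformer_engine.pytorch as te
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def init_fn(module: torch.nn.Module) -> None:
    # Hypothetical helper: materialize meta-device parameters on the GPU,
    # then re-run the module's own parameter initialization.
    module.to_empty(device="cuda", recurse=False)
    if callable(getattr(module, "reset_parameters", None)):
        module.reset_parameters()

def build_model(defer_init: bool) -> FSDP:
    if defer_init:
        # Deferred init (the default after this change): parameters are
        # allocated on the meta device, so no GPU memory is consumed until
        # FSDP shards the model and resets the already sharded parameters.
        with torch.device("meta"):
            layer = te.TransformerLayer(1024, 4096, 16)
        return FSDP(layer, device_id=torch.cuda.current_device(),
                    param_init_fn=init_fn)
    # --no-defer-init: a full copy of the module is built on every rank,
    # and device memory drops only once FSDP shards the parameters.
    layer = te.TransformerLayer(1024, 4096, 16).cuda()
    return FSDP(layer, device_id=torch.cuda.current_device())

Comparing torch.cuda.memory_allocated() before and after the FSDP wrap on each path illustrates the memory difference the sample outputs above report.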