"megatron/git@developer.sourcefind.cn:OpenDAS/megatron-lm.git" did not exist on "cfe2c2be5d4e384b4efd9c2f2266edd68876b34e"
Commit 841bf60b authored by Hongkun Yu, committed by A. Unique TensorFlower

Update readme to emphasize 'global batch size'.

PiperOrigin-RevId: 269376599
parent c21bec54
@@ -96,6 +96,11 @@ tensorboard --logdir=$MODEL_DIR
 Users need to adjust `batch_size` and `num_gpus` to get good performance
 running multiple GPUs.
 
+**Note that:**
+when using multiple GPUs or TPUs, `batch_size` is the global batch size for
+all devices. For example, if the batch size is `4096*4` and there are 4
+devices, each device will take 4096 tokens as a batch budget.
+
 Command to run:
 ```
 python3 transformer_main.py --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \
...
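
To make the note above concrete, here is a minimal sketch of the per-device arithmetic. It is not part of the commit, and the names `global_batch_size`, `num_devices`, and `per_device_batch_size` are illustrative rather than actual variables from `transformer_main.py`:

```
# Illustrative only: how a global batch budget is split evenly across devices.
global_batch_size = 4096 * 4  # the value passed via --batch_size
num_devices = 4               # e.g. 4 GPUs (--num_gpus=4)

# Each replica processes an equal share of the global batch every step.
per_device_batch_size = global_batch_size // num_devices
print(per_device_batch_size)  # 4096 tokens per device per step
```

The same split applies to TPU cores: the flag always describes the total work per step, so scaling up the device count without raising `batch_size` shrinks each replica's share.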