Multi-worker support for Resnet. (#6206)
* Update official resnet for multi worker training with distribution strategies.
* Fixes for multi worker training.
* Fix call to `get_distribution_strategy`.
* Undo test change.
* Fix spacing.
* Move cluster configuration to distribution_utils.
* Move train_and_evaluate out of loop. Also, update docstrings for multi-worker flags and add use_train_and_evaluate flag.
* Update distribution_strategy flag to match exported name for collective strategy.
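
The commit message does not show what the cluster configuration looks like. As a rough sketch only, assuming the standard `TF_CONFIG` environment variable that TensorFlow's multi-worker distribution strategies read, a helper of this kind might build the per-worker configuration. The names `build_tf_config`, `worker_hosts`, and `task_index` are hypothetical and are not taken from `distribution_utils`:

```python
import json
import os


def build_tf_config(worker_hosts, task_index):
    """Return a TF_CONFIG JSON string for one worker in a cluster.

    Hypothetical helper for illustration; not the code from this commit.
    worker_hosts: comma-separated "host:port" entries, one per worker.
    task_index: zero-based position of the current worker in that list.
    """
    workers = [h.strip() for h in worker_hosts.split(",") if h.strip()]
    if not workers:
        raise ValueError("worker_hosts must name at least one host:port")
    if not 0 <= task_index < len(workers):
        raise ValueError("task_index must index into worker_hosts")
    return json.dumps({
        "cluster": {"worker": workers},
        "task": {"type": "worker", "index": task_index},
    })


if __name__ == "__main__":
    # Each worker process would export its own TF_CONFIG before the
    # distribution strategy is created.
    os.environ["TF_CONFIG"] = build_tf_config("host1:2222,host2:2222", 0)
    print(os.environ["TF_CONFIG"])
```

Every worker would receive the same host list and a different `task_index`, so the only per-process difference is the `task` entry.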