- 27 Feb, 2020 3 commits
-
-
Jeff Rasley authored
* add text about mpirun
-
Jeff Rasley authored
* add mpirun support for openmpi 4.0
* add master addr support from args
* switch mpi detection to use mpi4py
* set constant for default distributed port
* Make sure deepspeed_mpi exists in args
-
Jeff Rasley authored
-
- 26 Feb, 2020 1 commit
-
-
Jeff Rasley authored
* add auto-detect to torch dist init
* update tests to infer distributed init status
* prevent crash if dist_init_required is True but already initialized
* only init if safe to do so (forgot to add this file in prev commit)
-
- 25 Feb, 2020 2 commits
-
-
zenlytix authored
* Update scripts to handle cases where you have other VMs in your sub
* Support subs with other VMs and fix for PDSH permission error
* Minor fix to support subs with other VMs
* Added shutdown with or without delete VM option
In Azure, deallocating a VM is like shutting the machine down (and prevents billing). You can restart a deallocated VM. To fully drop the VM, delete is used: this command with the "-d" option will fully delete the VM. Without any argument it just deallocates (shuts down) the VM.
-
zenlytix authored
* Update scripts to handle cases where you have other VMs in your sub
* Support subs with other VMs and fix for PDSH permission error
* Minor fix to support subs with other VMs
-
- 24 Feb, 2020 3 commits
-
-
Jeff Rasley authored
-
Shaden Smith authored
-
Shaden Smith authored
* Removes DeepSpeedDataSource
* dropping unused imports
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 22 Feb, 2020 1 commit
-
-
Olatunji Ruwase authored
* Support legacy optimizer fusion as config option
* Configure for legacy optimizer fusion
* Update configuration jsons for new apex
-
- 20 Feb, 2020 3 commits
-
-
Jeff Rasley authored
Also a fix for #94
-
Jeff Rasley authored
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
-
Shaden Smith authored
-
- 15 Feb, 2020 2 commits
-
-
kouml authored
* add install requirements command line
* add pillow library to fix version
* modify to uppercase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
bug fixes for adamw/lamb and corresponding tests
-
- 14 Feb, 2020 2 commits
-
-
Shaden Smith authored
* Porting BingBertSquad test
* Updating default paths.
* Enable model tests.
* Updating DeepSpeedExamples submodule
* Adding BingBertSquad's log uploads.
* Messed up the submodule again :-)
-
Jeff Rasley authored
* Set up CI with Azure Pipelines for docker build/push
-
- 13 Feb, 2020 2 commits
-
-
Rahul Prasad authored
-
Jeff Rasley authored
* bump tf version in dockerfile
* Update install.sh
-
- 12 Feb, 2020 2 commits
-
-
Shaden Smith authored
-
eltonzheng authored
-
- 11 Feb, 2020 1 commit
-
-
Gaurav Menghani authored
* Fix broken link for the 1Cycle doc.
* Removed the 1Cycle link from README.md.
-
- 10 Feb, 2020 11 commits
-
-
Shaden Smith authored
-
Shaden Smith authored
-
Shaden Smith authored
* Importing 1Cycle tutorial.
* image paths
* Added LR schedule figure
* line wrap
* lowercase name
* Updating README links
* typo
-
sheikheddy authored
-
Jeff Rasley authored
-
Shaden Smith authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Shaden Smith authored
-
Jeff Rasley authored
-
Shaden Smith authored
* Increasing section headers
* Move testing under contributing
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Shaden Smith authored
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Shaden Smith authored
-
- 09 Feb, 2020 4 commits
-
-
Shaden Smith authored
-
Jeff Rasley authored
-
dependabot[bot] authored
Bumps [tensorflow-gpu](https://github.com/tensorflow/tensorflow) from 1.14.0 to 1.15.2.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](https://github.com/tensorflow/tensorflow/compare/v1.14.0...v1.15.2)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
-
- 08 Feb, 2020 3 commits
-
-
Shaden Smith authored
-
Jeff Rasley authored
Add Azure tutorial text and scripts
-
Shaden Smith authored
-