- 22 Mar, 2020 2 commits
-
-
Calogero Zarbo authored
-
kouml authored
* remove session_params in deepspeed_constants.py
* add constants info at README.md
-
- 18 Mar, 2020 4 commits
-
-
Shaden Smith authored
* Better config filename
* Clean up configuration ToC
-
Shaden Smith authored
* fix docs permalink
* fix docs permalink
-
Shaden Smith authored
-
Shaden Smith authored
* Add coming soon to posts
* Add what's new section to main page
-
- 17 Mar, 2020 5 commits
-
-
Shaden Smith authored
-
Shaden Smith authored
-
Shaden Smith authored
GitHub created a CNAME for us automatically. Cool.
-
Shaden Smith authored
-
Shaden Smith authored
-
- 12 Mar, 2020 1 commit
-
-
Jeff Rasley authored
* add support for torch 1.3+ builds inside a docker build environment
* remove apex imports
-
- 11 Mar, 2020 2 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
* allow installing a specific apex commit
-
- 10 Mar, 2020 4 commits
-
-
Samyam Rajbhandari authored
* Enhancement: ability to load a checkpoint without loading the optimizer states. Adds a unit test that saves and loads checkpoints with the fused, unfused, and ZeRO optimizers. The unit test takes about 165 s.
-
Olatunji Ruwase authored
* add test cases for onecycle policy with fp16/zero
* Make lr schedulers support fp16 optimizers
* Fix formatting
* More specific naming
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Shaden Smith authored
-
Cola authored
-
- 09 Mar, 2020 1 commit
-
-
Incomplete authored
* Add --no_sudo to run without sudo
* Add --pip_mirror to set the pip mirror
* Default to running pip without sudo
* Typo
* Add --pip_sudo to Dockerfile and azure-pipelines.yml
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 07 Mar, 2020 1 commit
-
-
Olatunji Ruwase authored
-
- 03 Mar, 2020 1 commit
-
-
Jeff Rasley authored
* add support for deepspeed env file to pass custom env values
* simplify deepspeed config example
-
- 27 Feb, 2020 4 commits
-
-
Jeff Rasley authored
-
Jeff Rasley authored
* add text about mpirun
-
Jeff Rasley authored
* add mpirun support for openmpi 4.0
* add master addr support from args
* switch mpi detection to use mpi4py
* set constant for default distributed port
* Make sure deepspeed_mpi exists in args
-
Jeff Rasley authored
-
- 26 Feb, 2020 1 commit
-
-
Jeff Rasley authored
* add auto-detect to torch dist init
* update tests to infer distributed init status
* prevent crash if dist_init_required is True but already initialized
* only init if safe to do so (forgot to add this file in prev commit)
-
- 25 Feb, 2020 2 commits
-
-
zenlytix authored
* Update scripts to handle cases where you have other VMs in your sub
* Support subs with other VMs and fix for PDSH permission error
* Minor fix to support subs with other VMs
* Added shutdown with or without delete VM option
In Azure, deallocating a VM is like shutting the machine down (and it stops billing). A deallocated VM can be restarted. To drop the VM entirely, delete is used. Running this command with the "-d" option fully deletes the VM. Without any argument, it just deallocates (shuts down) the VM.
-
zenlytix authored
* Update scripts to handle cases where you have other VMs in your sub
* Support subs with other VMs and fix for PDSH permission error
* Minor fix to support subs with other VMs
-
- 24 Feb, 2020 3 commits
-
-
Jeff Rasley authored
-
Shaden Smith authored
-
Shaden Smith authored
* Removes DeepSpeedDataSource
* dropping unused imports
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
- 22 Feb, 2020 1 commit
-
-
Olatunji Ruwase authored
* Support legacy optimizer fusion as config option
* Configure for legacy optimizer fusion
* Update configuration jsons for new apex
-
- 20 Feb, 2020 3 commits
-
-
Jeff Rasley authored
Also a fix for #94
-
Jeff Rasley authored
Co-authored-by: Shaden Smith <ShadenTSmith@gmail.com>
-
Shaden Smith authored
-
- 15 Feb, 2020 2 commits
-
-
kouml authored
* add install requirements command line
* add pillow library to fix version
* modify to uppercase
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
-
Jeff Rasley authored
bug fixes for adamw/lamb and corresponding tests
-
- 14 Feb, 2020 2 commits
-
-
Shaden Smith authored
* Porting BingBertSquad test
* Updating default paths.
* Enable model tests.
* Updating DeepSpeedExamples submodule
* Adding BingBertSquad's log uploads.
* Messed up the submodule again :-)
-
Jeff Rasley authored
* Set up CI with Azure Pipelines for docker build/push
-
- 13 Feb, 2020 1 commit
-
-
Rahul Prasad authored
-