  1. 16 Dec, 2020 1 commit
  2. 15 Dec, 2020 1 commit
  3. 21 Aug, 2020 1 commit
  4. 01 Jun, 2020 1 commit
  5. 29 May, 2020 2 commits
  6. 10 Oct, 2019 1 commit
  7. 08 Oct, 2019 1 commit
  8. 27 Aug, 2019 1 commit
    • Enable Checkpointing (#420) · dec4fdd6
      ptrblck authored
      * add state_dict, load_state_dict
      
      * add test_restoring, test_loss_scale_decrease
      
      * disable amp outputs for checkpoint tests
      
      * add test for amp.state_dict, cleanup
      
      * add state_dict patch, add test
      
      * fixed testing, cleanup
      
      * add readme for checkpointing
      
      * add docs to source/amp
      
      * add review changes to doc
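
      The resulting workflow: amp.state_dict() / amp.load_state_dict() save
      and restore the loss scaler alongside the model and optimizer state. A
      minimal sketch following the pattern this PR documents (the model,
      optimizer, and checkpoint filename are placeholders):

          import torch
          from apex import amp

          model = torch.nn.Linear(4, 4).cuda()
          optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
          model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

          # Save amp's loss-scaler state together with model and optimizer.
          checkpoint = {
              'model': model.state_dict(),
              'optimizer': optimizer.state_dict(),
              'amp': amp.state_dict(),
          }
          torch.save(checkpoint, 'amp_checkpoint.pt')

          # Restore: amp.initialize must run first (with the same opt_level)
          # so the scaler exists to receive its saved state.
          checkpoint = torch.load('amp_checkpoint.pt')
          model.load_state_dict(checkpoint['model'])
          optimizer.load_state_dict(checkpoint['optimizer'])
          amp.load_state_dict(checkpoint['amp'])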
  9. 13 Aug, 2019 1 commit
  10. 24 Jun, 2019 1 commit
  11. 09 May, 2019 1 commit
  12. 30 Apr, 2019 1 commit
  13. 18 Apr, 2019 1 commit
  14. 11 Apr, 2019 1 commit
  15. 12 Mar, 2019 1 commit
  16. 07 Mar, 2019 2 commits
  17. 04 Mar, 2019 1 commit
  18. 01 Mar, 2019 2 commits
  19. 28 Feb, 2019 1 commit
    • typo · 519ff816
      vfdev authored
  20. 20 Feb, 2019 5 commits
  21. 28 Jan, 2019 1 commit
  22. 31 Oct, 2018 1 commit
  23. 30 Oct, 2018 1 commit
  24. 23 Oct, 2018 1 commit
    • [syncBN] (#48) · 81eef1ef
      jjsjann123 authored
      * [syncBN]
        added syncBN in native pure Python apex
        added fused CUDA kernels for sync BN, using Welford's algorithm for
          mean/var (an illustrative sketch follows this entry); optional
          installation via 'python setup.py install --cuda_ext'
        added a unit test with a side-by-side comparison between apex sync BN
          and PyTorch BN. Note that the PyTorch BN output will be slightly off
          because of numerical issues in its mean/var computation.
      
      * [syncBN PR]
        added fp16 support
        addressed review comments on:
          1. updating the last pow 2 computation
          2. catching the ImportError when importing the syncBN kernel
      
      * [syncBN PR]
        added a convert function to insert SyncBatchNorm (usage sketch after
          this entry)
        refactored some kernel code
      
      * fixed type issues (fp16/fp32/fp64)
        added Kahan summation
        updated the unit test to use PyTorch primitive ops in double
          precision; tests now pass within reasonable tolerances
      
      * updated tensor creation calls

      * fixed contiguous tensor handling in all_reduce

      * transposed the all_reduce results
      
      * [syncBN]
        support fp16 input with fp32 layers for apex fp16
        partially fixed launch configs
        enabled the imagenet example to run with --sync_bn
      
      * [syncBN PR]
        added documentation

      * adjusted the README

      * adjusted the README again

      * added some documentation to the imagenet example
      
      * [syncBN]
        warp-level reduction
        bug fix: updated the warp reduction logic; check for the dummy element
          to avoid NaN
        improved the launch config for better reduction kernels; a further
          improvement would be to increase the grid size

      * [syncBN]
        fixed undefined behavior in __shfl_down_sync caused by divergent
          threads in the warp reduction
        changed at::native::empty to at::empty (upstream comments)
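
      Two illustrative sketches for the features described above. First, the
      convert function is exposed as apex.parallel.convert_syncbn_model,
      which recursively replaces torch.nn BatchNorm layers with
      apex.parallel.SyncBatchNorm. A minimal usage sketch, assuming an NCCL
      process group has already been initialized (the model is a placeholder):

          import torch
          from apex.parallel import DistributedDataParallel, convert_syncbn_model

          # SyncBatchNorm reduces batch statistics across the process group,
          # so torch.distributed.init_process_group(backend='nccl') must have
          # been called before the converted model is used.
          model = torch.nn.Sequential(
              torch.nn.Conv2d(3, 8, 3),
              torch.nn.BatchNorm2d(8),  # swapped for apex.parallel.SyncBatchNorm
              torch.nn.ReLU(),
          ).cuda()
          model = convert_syncbn_model(model)
          model = DistributedDataParallel(model)

      Second, on the numerics: the fused kernels accumulate mean/var with
      Welford's online algorithm, which avoids the catastrophic cancellation
      of the naive E[x^2] - E[x]^2 formulation. A pure-Python illustration of
      the recurrence (not the kernel code itself):

          def welford_mean_var(xs):
              # One pass over the data (assumed non-empty); m2 accumulates
              # the sum of squared deviations from the running mean.
              mean, m2, n = 0.0, 0.0, 0
              for x in xs:
                  n += 1
                  delta = x - mean
                  mean += delta / n
                  m2 += delta * (x - mean)
              return mean, m2 / n  # population variance, as batch norm uses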
  25. 08 Oct, 2018 1 commit
  26. 22 Aug, 2018 1 commit
  27. 19 Aug, 2018 1 commit
  28. 05 Jul, 2018 1 commit
  29. 15 Jun, 2018 3 commits
  30. 14 Jun, 2018 2 commits