- 29 Nov, 2015 6 commits
-
Davis King authored
-
Davis King authored
train_one_step() member function. Also improved how host-to-device transfers are overlapped with kernel computation.
-
Davis King authored
-
Davis King authored
-
Davis King authored
to finish before overwriting the device memory with updated values from the host.
-
Davis King authored
currently active device ID if the user changes the active device via a call to cudaSetDevice().
-
- 26 Nov, 2015 1 commit
-
Davis King authored
-
- 21 Nov, 2015 10 commits
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
than adding them. This way, the gradient buffer can be used as scratch space during the loss computation.
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
- 20 Nov, 2015 5 commits
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
- 19 Nov, 2015 1 commit
-
Davis King authored
-
- 18 Nov, 2015 5 commits
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
runs on the GPU, and made affine_transform() take only tensors.
-
- 16 Nov, 2015 5 commits
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
-
Davis King authored
use either CPU or GPU. Fixed a bug in gemm().
-
- 13 Nov, 2015 5 commits
-
Davis King authored
-
Davis King authored
form.
-
Davis King authored
in-place.
-
Davis King authored
-
Davis King authored
-
- 11 Nov, 2015 2 commits
-
Davis King authored
-
Davis King authored