- 05 Feb, 2024 1 commit
Thorsten Kurth authored
* initial implementation of distributed DISCO layer
* working distributed convolution
* working refactored serial conv transpose with torch kernel
* working distributed conv and transposed conv when using the python kernel
* working distributed convolution with torch kernel
* fixed triton kernel tests
* adding print statement to debug CI
* adjusting tolerances in local convolution unittest

---------

Co-authored-by: Boris Bonev <bbonev@nvidia.com>
- 22 Dec, 2023 1 commit
Boris Bonev authored
Changed the code to only implicitly use sparse tensors in the modules, in order to enable compatibility with DDP
- 20 Dec, 2023 1 commit
Boris Bonev authored
* Moved convolutions and exposed them directly
* Added transposition to the unit test
* Minor bugfix in CPU version of DISCO transpose code
* Adding convolution tests to CI
* Added gradient check
* Checking the weight grad as well
* Added test for anisotropic kernels