- 09 Oct, 2025 1 commit
-
-
jberchtold-nvidia authored
Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> Co-authored-by:
Phuong Nguyen <phuonguyen@nvidia.com>
-
- 30 Sep, 2025 1 commit
-
-
jberchtold-nvidia authored
Load modules during initialize Signed-off-by:
Jeremy Berchtold <jberchtold@nvidia.com> Co-authored-by:
JAX Toolbox <jax@nvidia.com>
-
- 01 Apr, 2025 1 commit
-
-
Phuong Nguyen authored
* refactor + mxfp8 * added grouped gemm * rename linear to dense * added cublas init phase for groupedGemm * relax the tol of test encoder multiprocessing mxfp8 by 0.001 Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> --------- Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> Co-authored-by:
Hua Huang <huah@nvidia.com> Co-authored-by:
Jeremy Berchtold <jberchtold@nvidia.com>
-
- 02 Jan, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by:Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 07 Nov, 2024 1 commit
-
-
Phuong Nguyen authored
* added prepare phase for the FusedAttnForwardFFI * enabled FusedAttnForwardFFI by default * moved prepare phase into pybind --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-
- 31 Oct, 2024 1 commit
-
-
Phuong Nguyen authored
* lowering a dict of attrs * improve err message with line and func info * implement a product() for ffi dimensions --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-
- 17 Oct, 2024 1 commit
-
-
Phuong Nguyen authored
* register CmdBufferCompatible traits via C++ API * renamed FFI_Traits * use register_ffi_target() --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-
- 14 Aug, 2024 1 commit
-
-
Phuong Nguyen authored
* implemented custom call with ffi in csrc * moved headers of misc to misc.h, add ffi.h * ActLu and DActLu lowering with ffi_lowering * CastTranspose with ffi_lowering * enabled cudaGraph * added 4d input test case to TestActivationLu * added operand_output_aliases for CastTranspose * added env var NVTE_JAX_WITH_FFI, default value = 1 * replace casting ActivationEnum by taking its value --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-