- 14 Feb, 2025 1 commit
-
-
Phuong Nguyen authored
* fixes L1 test * fix test_multigpu_encoder * fixes for other multi-encoder tests * jax.extend.ffi to jax.ffi * initialization with float32 * add init_dtype as an optional arg to all modules * update use_scan query from xla flags * relax threshold for test_encoder fp8 * relax the tols --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-
- 02 Jan, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by:Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 06 Nov, 2024 1 commit
-
-
Hua Huang authored
* FFI for some transpose & activation functions Signed-off-by:
Hua Huang <huah@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove comments in transformer_engine/jax/csrc/extensions/activation.cpp Co-authored-by:
Phuong Nguyen <36155692+phu0ngng@users.noreply.github.com> Signed-off-by:
Hua Huang <huangh1994@outlook.com> --------- Signed-off-by:
Hua Huang <huah@nvidia.com> Signed-off-by:
Hua Huang <huangh1994@outlook.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Phuong Nguyen <36155692+phu0ngng@users.noreply.github.com>
-
- 24 Oct, 2024 1 commit
-
-
Hua Huang authored
[JAX] XLA Custom Calls with FFI for FusedAttnFwd, Quantize, Transpose, ActLuFP8, LayerNormForwardFP8FFI, and LayerNormBackwardFFI (#1263) * Add TransposeFFI, test passed Signed-off-by:
Hua Huang <huah@nvidia.com> * Add ActLuFP8FFI; fix TransposeFFI Signed-off-by:
Hua Huang <huah@nvidia.com> * Add QuantizeFFI Signed-off-by:
Hua Huang <huah@nvidia.com> * Add FusedAttnForwardFFI and some unit tests Signed-off-by:
Hua Huang <huah@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Minor fix Signed-off-by:
Hua Huang <huah@nvidia.com> * Add LayerNormForwardFP8FFI & LayerNormBackwardFFI Signed-off-by:
Hua Huang <huah@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revise FusedAttnForwardFFI() Signed-off-by:
Hua Huang <huah@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add FFI_CudaGraph_Traits All tests passed, ready for merge Signed-off-by:
Hua Huang <huah@nvidia.com> * Bug fix for FFI data type mismatch Also add a safeguard on the entrance to FFI function Signed-off-by:
Hua Huang <huah@nvidia.com> --------- Signed-off-by:
Hua Huang <huah@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 19 Aug, 2024 1 commit
-
-
Frédéric Bastien authored
Signed-off-by:
Frederic Bastien <fbastien@nvidia.com> Co-authored-by:
Phuong Nguyen <36155692+phu0ngng@users.noreply.github.com>
-
- 14 Aug, 2024 1 commit
-
-
Phuong Nguyen authored
* implemented custom call with ffi in csrc * moved headers of misc to misc.h, add ffi.h * ActLu and DActLu lowering with ffi_lowering * CastTranspose with ffi_lowering * enabled cudaGraph * added 4d input test case to TestActivationLu * added operand_output_aliases for CastTranspose * added env var NVTE_JAX_WITH_FFI, default value = 1 * replace casting ActivationEnum by taking its value --------- Signed-off-by:Phuong Nguyen <phuonguyen@nvidia.com>
-
- 17 Jul, 2024 1 commit
-
-
Reese Wang authored
* Add enabled() to BasePrimitive * Add layernorm/rmsnorm fallback * Add cast_fp8 fallback * Add transpose/cast_transpose XLA fall back * Act_lu fallback * Add transpose fallback * Add softmax fallback * Unify the use of _cast_fp8 * Add tests for NVTE_JAX_CUSTOM_CALLS_RE --------- Signed-off-by:
Reese Wang <rewang@nvidia.com> Co-authored-by:
Phuong Nguyen <36155692+phu0ngng@users.noreply.github.com>
-
- 09 Jul, 2024 1 commit
-
-
Phuong Nguyen authored
* removed singleton wrapper Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> --------- Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com>
-
- 14 Jun, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
* Apply formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Apply formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 13 Jun, 2024 1 commit
-
-
Phuong Nguyen authored
* Splitted cpp_extensions.py, renamed mlp.py and fused_attn.py Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> * fixed import in tests Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com> --------- Signed-off-by:
Phuong Nguyen <phuonguyen@nvidia.com>
-