- 04 Apr, 2025 2 commits
-
-
Phuong Nguyen authored
* rename QuantizeAxis to QuantizeLayout, get_layout to get_data_layout, q_axis to q_layout
* add flatten_axis option
* added gated act to test encoder
* sharding constraint fixes
* fix padding when the flattened first dim needs to be padded
* update test sizes so that padding is tested
* rm output sharding as it can be done in the flax module
* sharding scale_inv for mxfp8
---------
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
-
jberchtold-nvidia authored
MXFP8 flax layer tests
Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>
-
- 01 Apr, 2025 1 commit
-
-
Phuong Nguyen authored
* refactor + mxfp8
* added grouped gemm
* rename linear to dense
* added cublas init phase for groupedGemm
* relax the tol of test encoder multiprocessing mxfp8 by 0.001
  Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
---------
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
Co-authored-by: Hua Huang <huah@nvidia.com>
Co-authored-by: Jeremy Berchtold <jberchtold@nvidia.com>
-
- 18 Feb, 2025 1 commit
-
-
Phuong Nguyen authored
flax module with compute dtype inferred from the inputs
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
-
- 02 Jan, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 14 Jun, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
* Apply formatting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Apply formatting
  Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 13 Jun, 2024 1 commit
-
-
Phuong Nguyen authored
* Split cpp_extensions.py, renamed mlp.py and fused_attn.py
  Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
* fixed import in tests
  Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
---------
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
-
- 12 Jun, 2024 1 commit
-
-
Ming-Xu Huang authored
* Reformat FP8 Meta
  1. Reformat FP8 meta to be one-set-per-tensor.
  2. Remove fp8_max and scale_inv.
  3. Remove unused functions in fp8.py
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Fix unit-tests
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Remove ShardingType and MajorShardingType
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Fix lint errors
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Fixed unittests.
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Rename a few variables.
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Add jit to update_amax_list
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Fixed naming error in LayernormMLP
  Signed-off-by: Ming Huang <mingh@nvidia.com>
* Fixed bugs in test_distributed_layernorm_mlp.py
  Signed-off-by: Ming Huang <mingh@nvidia.com>
---------
Signed-off-by: Ming Huang <mingh@nvidia.com>
-
- 11 Jun, 2024 1 commit
-
-
Phuong Nguyen authored
* added distributed test for ln_mlp primitive
* added distributed test for LayerNorm layer
* changed error messages
---------
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
-