- 22 Oct, 2025 1 commit
-
-
Evgeny Tsykunov authored
* rename experimental -> custom_recipes Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Decouple python base classes (api) Signed-off-by:
Evgeny <etsykunov@nvidia.com> * update test_custom_recipe Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Rename experimental -> custom Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Minor Signed-off-by:
Evgeny <etsykunov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix import Signed-off-by:
Evgeny <etsykunov@nvidia.com> * Update tests/pytorch/nvfp4/test_nvfp4_rht_quantize_exact.py Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Evgeny Tsykunov <e.tsykunov@gmail.com> * Update tests/pytorch/test_custom_recipe.py Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> Signed-off-by:
Evgeny Tsykunov <e.tsykunov@gmail.com> * quantization_base -> quantized_tensor rename Signed-off-by:
Evgeny <etsykunov@nvidia.com> --------- Signed-off-by:
Evgeny <etsykunov@nvidia.com> Signed-off-by:
Evgeny Tsykunov <e.tsykunov@gmail.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 14 Jul, 2025 1 commit
-
-
hx authored
* optimize permute Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix lint Signed-off-by:
Xin Yao <xiny@nvidia.com> --------- Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> Signed-off-by:
Xin Yao <xiny@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Xin Yao <xiny@nvidia.com>
-
- 14 Apr, 2025 1 commit
-
-
Autumn1998 authored
* add support for new recipe on permute_fusion, rm fp unpermute Signed-off-by:
tongliu <tongliu@nvidia.com> * fix lint Signed-off-by:
Xin Yao <xiny@nvidia.com> * remove fp8 from index map Signed-off-by:
Xin Yao <xiny@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * skip unsupported tests Signed-off-by:
Xin Yao <xiny@nvidia.com> --------- Signed-off-by:
tongliu <tongliu@nvidia.com> Signed-off-by:
Xin Yao <xiny@nvidia.com> Co-authored-by:
tongliu <tongliu@nvidia.com> Co-authored-by:
Xin Yao <xiny@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
-
- 18 Feb, 2025 1 commit
-
-
hx authored
* add prob permute; fix fp8tensor Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert unnecessary changes in UT Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * remove unnecessary probs dtype convert Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * keep the output nums if probs is not provided Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refine the doc string Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * fix lint Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * use fp32 compute type Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * style fix Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * fix empty input return Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * separate prob related functions out Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> Co-authored-by:
pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by:
Xin Yao <xiny@nvidia.com> Co-authored-by:
Phuong Nguyen <phuonguyen@nvidia.com>
-
- 07 Feb, 2025 1 commit
-
-
Przemek Tredak authored
Signed-off-by:Przemek Tredak <ptredak@nvidia.com>
-
- 27 Jan, 2025 1 commit
-
-
hx authored
* add mask-based moe permutation * change moe_chunk_permute to moe_sort_chunks_by_indices * fix __all__ in pytorch/permutation.py * fix func/var names and typos; update tols in UT --------- Signed-off-by:
Hongxiao Bai <hongxiaob@nvidia.com> Co-authored-by:
Phuong Nguyen <phuonguyen@nvidia.com> Co-authored-by:
Tim Moon <tmoon@nvidia.com>
-
- 02 Jan, 2025 1 commit
-
-
Kirthi Shankar Sivamani authored
Signed-off-by:Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 16 Oct, 2024 1 commit
-
-
Kirthi Shankar Sivamani authored
* Upgrade pylint and first round formatting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * round 2 Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * round 3 Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Format and fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Paddle lint Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Reviews Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * FIxes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * More linting Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Run formatter Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Paddle lint Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fixes Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com> --------- Signed-off-by:
Kirthi Shankar Sivamani <ksivamani@nvidia.com>
-
- 29 Aug, 2024 1 commit
-
-
Xin Yao authored
* remove dtype from args * update docs with permutation ops --------- Signed-off-by:Xin Yao <xiny@nvidia.com>
-
- 22 Aug, 2024 1 commit
-
-
NVJiangShao authored
* Add permutation functions * Add permutation ops * Remove the dependency on cutlass * Move permutation.py out of module dir * Rewrite the unit test and enable skipping if FP8 is unavailable * Rename exposed C++ API and reorder its parameters + take NVTETensor as inputs * Use Float8Tensor for FP8 input * Move dtype to ctx --------- Signed-off-by:
Jiang Shao <jiangs@nvidia.com> Co-authored-by:
Qi Zhang <qizhang@nvidia.com> Co-authored-by:
Phuong Nguyen <36155692+phu0ngng@users.noreply.github.com>
-