- 08 Aug, 2024 1 commit
Reese Wang authored
* Support non-deterministic algo Signed-off-by: Reese Wang <rewang@nvidia.com>
* Refine the helper function name Signed-off-by: Reese Wang <rewang@nvidia.com>
* Move fixture to conftest.py Signed-off-by: Reese Wang <rewang@nvidia.com>
---------
Signed-off-by: Reese Wang <rewang@nvidia.com>
Co-authored-by: Phuong Nguyen <36155692+phu0ngng@users.noreply.github.com>
- 06 Aug, 2024 2 commits
Reese Wang authored
* Support actlen = 0 after cuDNN 9.3.0 Signed-off-by: Reese Wang <rewang@nvidia.com>
* Add runtime_segment < max_segment tests Signed-off-by: Reese Wang <rewang@nvidia.com>
---------
Signed-off-by: Reese Wang <rewang@nvidia.com>
Charlene Yang authored
* add multi-latent attention for DPA Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix Jax/Paddle API Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix lint Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix typo in test script Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix too-many-boolean lint error Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* Revert "fix lint" This reverts commit 67399a3a6f45bb4ce9e5eaa6bcce40b28e347e5b. Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix stride check in get_qkv_layout Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* WIP: fix layout_thd tests Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* WIP: debug info Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix merge conflict Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix thd pad_between_seqs=False/True tests Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- 25 Jul, 2024 1 commit
Kirthi Shankar Sivamani authored
* Specify python version Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add classifiers for python Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add utils to build wheels Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* make wheel scripts Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add aarch Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fixes Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix paddle wheel Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* PaddlePaddle only builds for x86 Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add optional fwk deps Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Python3.8; catch install error Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* [wip] cudnn9 compile with paddle support Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* [wip] dont link cudnn Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* dlopen cudnn Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fixes Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fixes Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* dynamically load nvrtc Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Lint Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* remove residual packages; exclude stub from nvrtc .so search Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Exclude builtins from nvrtc .so search Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* properly include files for sdist Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* paddle wheel tie to python version Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix paddle build from src [wip] Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix workflow paddle build Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix paddle Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix paddle Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix lint from pr986 Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* fix Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add sanity wheel test Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Add sanity import to wheel test Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* remove upper limit on paddlepaddle version Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Remove unused imports Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Remove pybind11 dependency Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Fix cpp tests Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Search .sos in cuda home Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fixes Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Cleanup, remove residual code Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- 10 Jul, 2024 1 commit
Charlene Yang authored
* add cuDNN swa Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix SWA Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* add set_deterministic and minor fixes for swa Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* add AttentionParams Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* change window_size to int64_t; fix swa/determinism tests; cache _attention_backends Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* add window_size to get_backend; fix jax and paddle Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* minor fixes; add set_deter to bwd_impl Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix unit tests Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix FP8 tests due to determinism Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* add support matrix for SWA and bias Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* minor fixes and lint Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* minor fixes Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* add wording on window_size special cases Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* minor tweak on wording Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix jax assertion error Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix wording Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* call bwd with deterministic=true for jax/paddle Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* add determinism words in documentation Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
---------
Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- 03 Jul, 2024 1 commit
Reese Wang authored
* Integrate experimental ragged offset Signed-off-by: Reese Wang <rewang@nvidia.com>
* Use per sequence based offsets Signed-off-by: Reese Wang <rewang@nvidia.com>
* Format Signed-off-by: Reese Wang <rewang@nvidia.com>
* Remove v/o_seq_offsets Signed-off-by: Reese Wang <rewang@nvidia.com>
* Add FP16 sanity tests and remove forward tests from the automatically run tests Signed-off-by: Reese Wang <rewang@nvidia.com>
* Enhance input checks Signed-off-by: Reese Wang <rewang@nvidia.com>
* Separate fused attn to 2 different APIs and add the docs Signed-off-by: Reese Wang <rewang@nvidia.com>
* Add experimental to the docs Signed-off-by: Reese Wang <rewang@nvidia.com>
* Fix lint Signed-off-by: Reese Wang <rewang@nvidia.com>
* Add runtime segments check Signed-off-by: Reese Wang <rewang@nvidia.com>
* Remove finished TODO Signed-off-by: Reese Wang <rewang@nvidia.com>
---------
Signed-off-by: Reese Wang <rewang@nvidia.com>
- 18 Jun, 2024 1 commit
Charlene Yang authored
* simplify offset tensors Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* minor fixes; tests pass Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix C lint Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* replace with_offset with with_padding Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* replace with_padding with padded Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* minor fixes after merge Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* minor fix for fused attn fwd/bwd calls Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix Jax Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* adjust spacing in docstring Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix pytorch tests; fix paddle api Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix lint Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix attn_biases Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix AttnFuncWithCP backward Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix jax Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix attn with CP Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* fix paddle Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Charlene Yang <8636796+cyanguwa@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- 14 Jun, 2024 1 commit
Kirthi Shankar Sivamani authored
* Apply formatting Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Apply formatting Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
- 08 Jun, 2024 1 commit
Phuong Nguyen authored
* categorized `csrc/modules.cpp` Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
* adapted the build tool Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
---------
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>