"tools/git@developer.sourcefind.cn:OpenDAS/nni.git" did not exist on "4592ba418cd470eaed2129a4600b233cb2b1900f"
Unverified Commit a86223f4 authored by alex_xiao's avatar alex_xiao Committed by GitHub
Browse files

[CI] Fix AMD CI (#729)



* [Enhancement] Refactor buffer index handling for improved precision and clarity (#668)

- Enhanced buffer index handling to address precision issues by removing redundant operations.
- Streamlined the logic for determining buffer overlaps, ensuring more accurate conflict detection.
- Updated related documentation to reflect changes in buffer management practices.

* Remove obsolete test script for AMD example, streamlining the examples directory.

* Remove unused dtype_size variable in AMD example script to streamline code.

* Add input configuration file and update AMD example script for enhanced flexibility

- Introduced a new input.txt file for configurable parameters.
- Modified the example_amd_flash_attn_fwd.py script to allow for a wider range of configurations, including additional options for num_stages, enable_rasterization, and k_pack.
- Streamlined the main function for better clarity and organization.
- Added a new test script to facilitate running the example with specified parameters.

* Remove input configuration file and obsolete test script; enhance AMD example with swizzle layout annotations

- Deleted input.txt and test.sh files as they are no longer needed.
- Updated example_amd_flash_attn_fwd.py to include swizzle layout annotations for shared memory, improving bank conflict avoidance.
- Reintroduced swizzle usage in the kernel for better performance.

* Refactor AMD example script for FlashAttention-2

- Updated function names for clarity, changing `get_v2_configs` to `get_configs` and `fast_flashattn_v2` to `fast_flashattn`.
- Streamlined the main function by renaming `main_v2` to `main` and adjusting the corresponding calls.
- Removed outdated comments and improved code organization for better readability.

* Refactor formatting in AMD FlashAttention example script

- Improved code readability by adjusting line breaks and indentation in the `fast_flashattn` function.
- Streamlined the `main` function parameter formatting for consistency.
- Removed unnecessary blank lines to enhance overall code organization.

* Update example_amd_flash_attn_fwd.py

* Enhance AMD example script and update CI workflows

- Improved the `example_amd_flash_attn_fwd.py` script for better clarity and organization.
- Added new CI workflows for AMD and documentation publishing.
- Updated various requirements files to include necessary dependencies.
- Introduced new test cases and examples for better coverage and functionality.
- Refactored existing code for improved readability and maintainability.

* Remove redundant tool cache cleanup step in AMD CI workflow

* Remove `torch` dependency from `requirements-rocm.txt` to streamline requirements.

---------
Co-authored-by: default avatarxinxyxiao <xinyxiao@amd.com>
Co-authored-by: default avatarLei Wang <34334180+LeiWang1999@users.noreply.github.com>
parent a5074fd5
...@@ -117,4 +117,4 @@ jobs: ...@@ -117,4 +117,4 @@ jobs:
source "${{ runner.tool_cache }}/${{ env.VENV_DIR }}/bin/activate" source "${{ runner.tool_cache }}/${{ env.VENV_DIR }}/bin/activate"
cd testing/python/amd cd testing/python/amd
unset PYTHONPATH unset PYTHONPATH
python -m pytest -v test_tilelang_test_amd.py python -m pytest -v test_tilelang_test_amd.py
\ No newline at end of file
...@@ -20,7 +20,6 @@ requests ...@@ -20,7 +20,6 @@ requests
cloudpickle cloudpickle
ml_dtypes ml_dtypes
psutil psutil
torch
tabulate tabulate
wheel wheel
setuptools setuptools
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment