[Dev] Remove unnecessary python dependencies (#69)

* [Enhancement] Add VectorizeLoop function and update imports for compatibility
* [CI][Test] Improve test cases for vectorization and fix typos in parser comments
* lint fix
* Fix incorrect module reference for VectorizeLoop transformation
* Refactor vectorize_loop transformation by removing unused extent mutation logic
* [Enhancement] Add support for FP8 data types and global barriers in CUDA codegen
* Fix formatting in CUDA FP8 header file for consistency
* Refactor CI workflow to use 'tilelang_ci' virtual environment and update CUDA type printing for better clarity
* Update submodule 'tvm' to latest commit for improved functionality
* Refactor execution backend references from 'dl_pack' to 'dlpack' for consistency and clarity; add apply_simplify function to simplify PrimFunc or IRModule.
* Refactor CUDA code for improved readability; clean up formatting and remove unnecessary whitespace in multiple files.
* Refactor import statement in test_tilelang_kernel_dequantize_gemm.py to use 'tilelang.language' for consistency
* Add CUDA requirements to FP8 test cases and update references for clarity
* Add a blank line for improved readability in test_tilelang_kernel_fp8_gemm_mma.py
* Fix data type in reference result calculation for consistency in test_tilelang_kernel_gemm_mma_intrinsic.py
* Add CUDA requirements and FP8 test cases for matmul and gemv simulations
* Remove debug print statements and use tilelang's testing assertion for result validation in test_tilelang_kernel_gemm_mma_intrinsic.py
* Remove outdated comment regarding FP8 tests in test_tilelang_kernel_gemv_simt.py
* Add BF16 support to matrix multiplication and introduce corresponding test cases
* Add a blank line for improved readability in BF16 GEMM test
* Update acknowledgements in README to include supervision by Zhi Yang at Peking University
* enhance acknowledgement
* Replace tutorial on memory layout optimization with new tutorial on writing high-performance kernels with thread primitives
* Update subproject commit for TVM dependency
* Update subproject commit for TVM dependency
* Add int4_t type and functions for packing char values in CUDA common header
* Add plot_layout example and implement GetForwardVars method in layout classes
* Refactor code for improved readability by adjusting line breaks and formatting in layout and test files
* Fix formatting by removing unnecessary line break in layout.h
* Refactor make_int4 function for improved readability by adjusting parameter formatting
* Add legend to plot_layout for improved clarity of thread and local IDs
* Remove unnecessary dependencies from requirements files for cleaner setup
* Remove flash_mha.py and add .gitkeep to deepseek_mla directory
* Add build requirements and update installation scripts for improved setup

Commit 2411fa28 authored Feb 10, 2025 by Lei Wang Committed by GitHub Feb 10, 2025
10 changed files
--- a/examples/deepseek_mla/.gitkeep
+++ b/examples/deepseek_mla/.gitkeep
--- a/install_cpu.sh
+++ b/install_cpu.sh
@@ -7,6 +7,7 @@ echo "Starting installation script..."
 # Step 1: Install Python requirements
 echo "Installing Python requirements from requirements.txt..."
+pip install -r requirements-build.txt
 pip install -r requirements.txt
 if [ $? -ne 0 ]; then
    echo "Error: Failed to install Python requirements."

--- a/install_cuda.sh
+++ b/install_cuda.sh
@@ -7,6 +7,7 @@ echo "Starting installation script..."
 # Step 1: Install Python requirements
 echo "Installing Python requirements from requirements.txt..."
+pip install -r requirements-build.txt
 pip install -r requirements.txt
 if [ $? -ne 0 ]; then
    echo "Error: Failed to install Python requirements."

--- a/install_rocm.sh
+++ b/install_rocm.sh
@@ -3,9 +3,17 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.
+echo "Starting installation script..."
 # install requirements
+pip install -r requirements-build.txt
 pip install -r requirements.txt
+if [ $? -ne 0 ]; then
+    echo "Error: Failed to install Python requirements."
+    exit 1
+else
+    echo "Python requirements installed successfully."
+fi
 # determine if root
 USER_IS_ROOT=false
 if [ "$EUID" -eq 0 ]; then
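A note on the install-script hunks above: in a shell script, `$?` holds the exit status of the most recent command only, so a check placed after both `pip install` lines reports a failure of `requirements.txt` but not one of `requirements-build.txt`. The sketch below shows one way to check every step, written in Python for illustration. The helper names `run_steps` and `install_requirements` are hypothetical and are not part of this repository.

```python
import subprocess
import sys


def run_steps(commands):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Stop at the first failing step so later steps cannot mask it.
            print(f"Error: step failed: {' '.join(command)}", file=sys.stderr)
            return result.returncode
    return 0


def install_requirements(python=sys.executable):
    """Install build requirements first, then runtime requirements.

    Hypothetical wrapper: the file names come from the diff, the function
    itself is only an illustration.
    """
    return run_steps([
        [python, "-m", "pip", "install", "-r", "requirements-build.txt"],
        [python, "-m", "pip", "install", "-r", "requirements.txt"],
    ])
```

Calling `sys.exit(install_requirements())` from a script would propagate the first failing step's exit code to the caller.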

--- a/pyproject.toml
+++ b/pyproject.toml
@@ -3,7 +3,6 @@ requires = [
    "cmake>=3.26",
    "packaging",
    "setuptools>=61",
-    "setuptools-scm>=8.0",
    "wheel",
 ]
 build-backend = "setuptools.build_meta"

--- a/requirements-build.txt
+++ b/requirements-build.txt
+# Should be mirrored in pyproject.toml
+cmake>=3.26
+packaging
+setuptools>=61
+torch
+wheel
--- a/requirements-dev.txt
+++ b/requirements-dev.txt
@@ -28,7 +28,6 @@ cloudpickle
 ml_dtypes
 psutil
 scipy
-tornado
 torch
 thefuzz
 tabulate

--- a/requirements-test.txt
+++ b/requirements-test.txt
@@ -27,8 +27,6 @@ attrs
 cloudpickle
 ml_dtypes
 psutil
-scipy
-tornado
 torch
 thefuzz
 tabulate

--- a/requirements.txt
+++ b/requirements.txt
-# build requirements
-cmake>=3.26
 # runtime requirements
-cffi
-cpplint
 Cython
 decorator
-docutils
-dtlib
 numpy>=1.23.5
-pytest>=6.2.4
-pytest_xdist>=2.2.1
-packaging>=21.0
-PyYAML
 tqdm>=4.62.3
 typing_extensions>=4.10.0
-requests
 attrs
 cloudpickle
 ml_dtypes
 psutil
-scipy
-tornado
 torch
-thefuzz
-tabulate
--- a/tilelang/tools/plot_layout.py
+++ b/tilelang/tools/plot_layout.py
@@ -132,6 +132,18 @@ def plot_layout(layout: T.Layout,
    plt.xticks([])  # Remove x-axis ticks
    plt.yticks([])  # Remove y-axis ticks
+    legend_patches = [
+        patches.Patch(color='black', label="T: Thread ID"),
+        patches.Patch(color='black', label="L: Local ID")
+    ]
+    ax.legend(
+        handles=legend_patches,
+        loc="upper right",
+        fontsize=font_size - 4,
+        frameon=False,
+        bbox_to_anchor=(1.0, 1.12),
+        ncols=2)
    # Create the output directory if it does not exist
    tmp_directory = pathlib.Path(save_directory)
    if not os.path.exists(tmp_directory):