[Dev] Remove unnecessary python dependencies (#69)
* [Enhancement] Add VectorizeLoop function and update imports for compatibility * [CI][Test] Improve test cases for vectorization and fix typos in parser comments * lint fix * Fix incorrect module reference for VectorizeLoop transformation * Refactor vectorize_loop transformation by removing unused extent mutation logic * [Enhancement] Add support for FP8 data types and global barriers in CUDA codegen * Fix formatting in CUDA FP8 header file for consistency * Refactor CI workflow to use 'tilelang_ci' virtual environment and update CUDA type printing for better clarity * Update submodule 'tvm' to latest commit for improved functionality * Refactor execution backend references from 'dl_pack' to 'dlpack' for consistency and clarity; add apply_simplify function to simplify PrimFunc or IRModule. * Refactor CUDA code for improved readability; clean up formatting and remove unnecessary whitespace in multiple files. * Refactor import statement in test_tilelang_kernel_dequantize_gemm.py to use 'tilelang.language' for consistency * Add CUDA requirements to FP8 test cases and update references for clarity * Add a blank line for improved readability in test_tilelang_kernel_fp8_gemm_mma.py * Fix data type in reference result calculation for consistency in test_tilelang_kernel_gemm_mma_intrinsic.py * Add CUDA requirements and FP8 test cases for matmul and gemv simulations * Remove debug print statements and use tilelang's testing assertion for result validation in test_tilelang_kernel_gemm_mma_intrinsic.py * Remove outdated comment regarding FP8 tests in test_tilelang_kernel_gemv_simt.py * Add BF16 support to matrix multiplication and introduce corresponding test cases * Add a blank line for improved readability in BF16 GEMM test * Update acknowledgements in README to include supervision by Zhi Yang at Peking University * enhance acknowledgement * Replace tutorial on memory layout optimization with new tutorial on writing high-performance kernels with thread primitives * Update subproject commit for TVM dependency * Update subproject commit for TVM dependency * Add int4_t type and functions for packing char values in CUDA common header * Add plot_layout example and implement GetForwardVars method in layout classes * Refactor code for improved readability by adjusting line breaks and formatting in layout and test files * Fix formatting by removing unnecessary line break in layout.h * Refactor make_int4 function for improved readability by adjusting parameter formatting * Add legend to plot_layout for improved clarity of thread and local IDs * Remove unnecessary dependencies from requirements files for cleaner setup * Remove flash_mha.py and add .gitkeep to deepseek_mla directory * Add build requirements and update installation scripts for improved setup
Showing
requirements-build.txt
0 → 100644
| # build requirements | |||
| cmake>=3.26 | |||
| # runtime requirements | # runtime requirements | ||
| cffi | |||
| cpplint | |||
| Cython | Cython | ||
| decorator | decorator | ||
| docutils | |||
| dtlib | |||
| numpy>=1.23.5 | numpy>=1.23.5 | ||
| pytest>=6.2.4 | |||
| pytest_xdist>=2.2.1 | |||
| packaging>=21.0 | |||
| PyYAML | |||
| tqdm>=4.62.3 | tqdm>=4.62.3 | ||
| typing_extensions>=4.10.0 | typing_extensions>=4.10.0 | ||
| requests | |||
| attrs | attrs | ||
| cloudpickle | cloudpickle | ||
| ml_dtypes | ml_dtypes | ||
| psutil | psutil | ||
| scipy | |||
| tornado | |||
| torch | torch | ||
| thefuzz | |||
| tabulate |
Please register or sign in to comment