[Enhancement] Enhance FP8/FP4 type handling in CUDA codegen (#323)
* [Enhancement] Introduce CUDA driver module and refactor CUDA device handling
- Added a new `cuda_driver` module to encapsulate CUDA device properties and functionalities.
- Updated `CUDA` class in `cuda.py` to utilize the new driver for fetching device name and shared memory capabilities.
- Introduced `get_device_name` and `get_shared_memory_per_block` functions in `cuda_driver`.
- This refactor centralizes CUDA device attribute queries in one module, so `cuda.py` no longer handles them directly.
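
A minimal sketch of what such a driver module could look like, for illustration only. The function names `get_device_name` and `get_shared_memory_per_block` come from this commit; the `device_id` parameter, the `None` fallback, the `_load_driver` helper, and the use of `ctypes` against the CUDA driver API are assumptions, and the actual module may query the device differently.

```python
import ctypes
from typing import Optional

# Attribute id taken from the CUDA driver API header (cuda.h).
CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK = 8


def _load_driver():
    """Hypothetical helper: load and initialize the CUDA driver, or return None."""
    for name in ("libcuda.so.1", "libcuda.so", "nvcuda.dll"):
        try:
            lib = ctypes.CDLL(name)
        except OSError:
            continue  # library not present under this name
        if lib.cuInit(0) == 0:  # 0 is CUDA_SUCCESS
            return lib
    return None


def _get_device(lib, device_id: int):
    """Resolve a device ordinal to a CUdevice handle, or return None."""
    dev = ctypes.c_int()
    if lib.cuDeviceGet(ctypes.byref(dev), device_id) != 0:
        return None
    return dev


def get_device_name(device_id: int = 0) -> Optional[str]:
    """Return the device name, or None when no driver or device is available."""
    lib = _load_driver()
    dev = _get_device(lib, device_id) if lib is not None else None
    if dev is None:
        return None
    buf = ctypes.create_string_buffer(256)
    if lib.cuDeviceGetName(buf, len(buf), dev) != 0:
        return None
    return buf.value.decode()


def get_shared_memory_per_block(device_id: int = 0) -> Optional[int]:
    """Return the maximum shared memory per block in bytes, or None."""
    lib = _load_driver()
    dev = _get_device(lib, device_id) if lib is not None else None
    if dev is None:
        return None
    value = ctypes.c_int()
    status = lib.cuDeviceGetAttribute(
        ctypes.byref(value), CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK, dev)
    return value.value if status == 0 else None
```

Returning `None` instead of raising keeps the sketch importable on machines without a GPU; a real module might prefer to raise.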
* [Refactor] Clean up whitespace in CUDA-related files
- Removed unnecessary blank lines in `cuda.py`, `__init__.py`, and `cuda_driver.py`; no functional change.
* [Benchmark] Add FP8 Matrix Multiplication Benchmark Script
- Introduced a new benchmark script for FP8 matrix multiplication in `benchmark/matmul_fp8/benchmark_matmul.py`.
- The script includes functions for reference matrix multiplication, configuration generation for autotuning, and an autotuned kernel for performance measurement.
- Added command-line argument parsing for matrix dimensions and the option to enable BitBLAS roller for search space exploration.
- The benchmark computes and prints the best latency and the corresponding performance metrics.
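
The pieces listed above could be sketched roughly as follows. This is not the script from `benchmark/matmul_fp8/benchmark_matmul.py`: the flag names (`--m`, `--n`, `--k`, `--with_roller`), the config keys, the default values, and the helper names are all assumptions chosen for illustration, and the kernel itself is omitted.

```python
import argparse
import itertools


def get_configs(block_M=(64, 128), block_N=(64, 128), block_K=(32, 64),
                num_stages=(2, 3)):
    """Enumerate a small autotuning search space as a list of dicts.

    The keys and candidate values are hypothetical placeholders.
    """
    keys = ("block_M", "block_N", "block_K", "num_stages")
    return [
        dict(zip(keys, values))
        for values in itertools.product(block_M, block_N, block_K, num_stages)
    ]


def tflops(M, N, K, latency_ms):
    """Convert a matmul latency in milliseconds to TFLOPS.

    An M x N x K matmul performs 2 * M * N * K floating-point operations.
    """
    return 2 * M * N * K / (latency_ms * 1e-3) / 1e12


def parse_args(argv=None):
    """Parse matrix dimensions and the roller switch from the command line."""
    parser = argparse.ArgumentParser(description="FP8 matmul benchmark (sketch)")
    parser.add_argument("--m", type=int, default=16384, help="rows of A and C")
    parser.add_argument("--n", type=int, default=16384, help="columns of B and C")
    parser.add_argument("--k", type=int, default=16384, help="shared dimension")
    parser.add_argument("--with_roller", action="store_true",
                        help="use BitBLAS roller to explore the search space")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"{len(get_configs())} candidate configs for "
          f"M={args.m}, N={args.n}, K={args.k}, roller={args.with_roller}")
```

Given a measured best latency, the reported throughput would then be `tflops(args.m, args.n, args.k, best_latency_ms)`.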
* lint fix
* Update submodule and enhance FP8 type handling in CUDA codegen
- Updated the TVM submodule to the latest commit.
- Modified FP8 type handling in `codegen_cuda.cc` to use more descriptive type codes.
- Improved constant printing for FP8 and bfloat16 types, ensuring correct representation in generated code.
- Added error handling for missing configuration keys in the AutoTuner class.
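
One way the missing-key check could work, shown as a standalone sketch rather than the actual `AutoTuner` code: before a config is applied, every kernel parameter without a default must be present in it, otherwise a `ValueError` names the missing keys. The helper `select_kernel_args` and the example kernel signature are hypothetical.

```python
import inspect


def select_kernel_args(kernel_fn, config):
    """Pick kernel_fn's arguments out of a tuning config dict.

    Raises ValueError if a parameter without a default is absent from config.
    Keys in config that kernel_fn does not accept are ignored.
    """
    params = inspect.signature(kernel_fn).parameters
    missing = [
        name for name, param in params.items()
        if param.default is inspect.Parameter.empty and name not in config
    ]
    if missing:
        raise ValueError(
            f"Configuration {config} is missing required key(s): {', '.join(missing)}")
    return {name: config[name] for name in params if name in config}


def example_kernel(block_M, block_N, num_stages=2):
    """Hypothetical kernel factory; it only echoes its arguments."""
    return block_M, block_N, num_stages


if __name__ == "__main__":
    kwargs = select_kernel_args(example_kernel, {"block_M": 64, "block_N": 128})
    print(example_kernel(**kwargs))
```

Failing early with the key names in the message is easier to debug than a bare `KeyError` raised partway through a tuning run.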
* lint fix
* Remove print statement from example script
* lint fix
* fix
---------
Co-authored-by: LeiWang1999 <wyatuestc@gmail.com>