Yu Cheng authored
* Introduced `example_grouped_gemm_fwd.py` and `example_grouped_gemm_bwd.py` to demonstrate grouped matrix multiplication with forward and backward operations.
* Implemented functions for grouped GEMM, input construction, and validation against PyTorch's implementation.
* Added command-line argument parsing for flexible input configuration, including batch sizes and matrix dimensions.
* Included a test function to validate the functionality with various input scenarios.
de028927
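For readers unfamiliar with grouped GEMM, the forward computation and the command-line input described in this commit can be sketched as below. This is a dependency-free illustration, not the code from the example scripts. It assumes the rows of `A` are stacked across groups and that each group has its own `(K, N)` weight matrix; the commit message does not state this layout. The flag names (`--batch_sizes`, `--K`, `--N`) and the helpers `parse_args`, `matmul`, and `grouped_gemm_reference` are hypothetical.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical flags; the real example scripts may use different names.
    parser = argparse.ArgumentParser(description="Grouped GEMM reference sketch")
    parser.add_argument("--batch_sizes", type=str, default="2,1",
                        help="comma-separated row counts, one per group")
    parser.add_argument("--K", type=int, default=2, help="shared inner dimension")
    parser.add_argument("--N", type=int, default=2, help="output columns")
    return parser.parse_args(argv)


def matmul(a, b):
    # Plain-Python (M, K) x (K, N) product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def grouped_gemm_reference(a, b_groups, batch_sizes):
    """Multiply each row slice of `a` by its own group's matrix.

    a:           nested list of shape (sum(batch_sizes), K), groups stacked by row
    b_groups:    one (K, N) nested list per group
    batch_sizes: number of rows of `a` that belong to each group
    """
    assert len(a) == sum(batch_sizes), "rows of A must match the batch sizes"
    assert len(b_groups) == len(batch_sizes), "one B matrix per group"
    out, start = [], 0
    for size, b in zip(batch_sizes, b_groups):
        out.extend(matmul(a[start:start + size], b))
        start += size
    return out


# Explicit argv so the sketch does not depend on how the script is launched.
args = parse_args(["--batch_sizes", "2,1", "--K", "2", "--N", "2"])
batch_sizes = [int(s) for s in args.batch_sizes.split(",")]

a = [[1, 2], [3, 4], [5, 6]]            # 3 rows = 2 (group 0) + 1 (group 1)
b_groups = [[[1, 0], [0, 1]],           # group 0: identity
            [[2, 0], [0, 2]]]           # group 1: 2 * identity
out = grouped_gemm_reference(a, b_groups, batch_sizes)
print(out)
```

A per-group loop like this is the usual baseline against which a fused grouped kernel is validated: the fused version should produce the same stacked output while avoiding one kernel launch per group.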