    [CUDA] Add read-only parameter annotation for CUDA codegen (#1416) · 00dd7388
    Lei Wang authored
    * [Enhancement] Add read-only parameter annotation for CUDA codegen
    
    * Introduced the `AnnotateReadOnlyParams` transformation to annotate read-only handle parameters in PrimFuncs, enabling the generation of `const` qualifiers in CUDA codegen.
    * Updated the `PrintFunctionSignature` and `AddFunction` methods to read the new `tl.readonly_param_indices` attribute and emit `const` for the listed parameters, which allows read-only cache loads.
    * Added the new annotation step to the optimization pipeline.
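
The signature change described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in `codegen_cuda.cc`: the `Param` struct and `PrintSignatureSketch` function are hypothetical names, and the use of `__restrict__` on every handle parameter is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parameter description; not a TVM or TileLang type.
struct Param {
  std::string type;  // element type for handles, value type otherwise
  std::string name;  // parameter name as printed in the signature
  bool is_handle;    // true for pointer (handle) parameters
};

// Build a CUDA-style kernel signature as a string. Handle parameters whose
// position appears in readonly_indices are printed with a `const` qualifier.
std::string PrintSignatureSketch(const std::string& func_name,
                                 const std::vector<Param>& params,
                                 const std::set<std::size_t>& readonly_indices) {
  std::ostringstream os;
  os << "extern \"C\" __global__ void " << func_name << "(";
  for (std::size_t i = 0; i < params.size(); ++i) {
    if (i != 0) os << ", ";
    const Param& p = params[i];
    if (p.is_handle) {
      if (readonly_indices.count(i) > 0) os << "const ";
      // Assumption: all handle parameters are printed as non-aliasing.
      os << p.type << "* __restrict__ " << p.name;
    } else {
      // Only handle parameters may be marked read-only in this sketch.
      assert(readonly_indices.count(i) == 0);
      os << p.type << " " << p.name;
    }
  }
  os << ")";
  return os.str();
}
```

With parameters `A` and `B` as `float` handles, `n` as an `int`, and index 0 marked read-only, the sketch prints `A` as `const float* __restrict__ A` and leaves `B` writable.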
    
    * lint fix
    
    * [Dependency] Update apache-tvm-ffi version to >=0.1.3
    
    * Updated the version of apache-tvm-ffi in pyproject.toml, requirements.txt, and requirements-dev.txt to ensure compatibility with the latest features and fixes.
    * Made adjustments in CUDA and HIP template files to use `const` qualifiers for global pointer parameters, enhancing code safety and clarity.
    
    * lint fix
    
    * [Enhancement] Refactor ReadWriteMarker for improved parameter handling
    
    * Updated the ReadWriteMarker class to accept a set of parameter or data variables, enhancing its ability to track written variables.
    * Introduced a new method, ResolveDataVarFromPtrArg, to resolve underlying buffer data from pointer-like arguments, improving accuracy in identifying written variables.
    * Modified the MarkReadOnlyParams function to gather handle parameters and their corresponding buffer data variables, streamlining the process of determining read-only parameters.
    * Enhanced the logic for identifying written variables to account for aliased data variables, ensuring comprehensive tracking of modifications.
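
The idea behind the refactor above can be sketched on a toy representation. This is not the `ReadWriteMarker` implementation: `ToyFunc` and `ReadOnlyParamIndicesSketch` are hypothetical, and the function body is reduced to a flat list of store targets rather than a real IR traversal. The point it illustrates is that a handle parameter counts as written if either the parameter itself or the data variable of its bound buffer is a store target.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a function (hypothetical, not a TVM PrimFunc).
struct ToyFunc {
  std::vector<std::string> handle_params;             // handle parameters, in order
  std::map<std::string, std::string> param_to_data;   // param -> buffer data var
  std::vector<std::string> stored_vars;               // variables the body stores to
};

// Return the positions of handle parameters that are never written,
// treating a store to the aliased buffer data variable as a write
// to the parameter itself.
std::vector<std::size_t> ReadOnlyParamIndicesSketch(const ToyFunc& f) {
  std::set<std::string> written(f.stored_vars.begin(), f.stored_vars.end());
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < f.handle_params.size(); ++i) {
    const std::string& param = f.handle_params[i];
    bool is_written = written.count(param) > 0;
    auto it = f.param_to_data.find(param);
    if (it != f.param_to_data.end() && written.count(it->second) > 0) {
      is_written = true;  // written through the aliased data variable
    }
    if (!is_written) result.push_back(i);
  }
  return result;
}
```

In a real pass, the resulting indices would presumably be what gets stored under the `tl.readonly_param_indices` attribute; the exact encoding is not shown in this commit message.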
    
    * lint fix
    
    * Update tma_load function to use const qualifier for global memory pointer
    
    * Changed the parameter type of gmem_ptr in the tma_load function from void* to void const* to enhance type safety and clarity in memory operations.
    * This modification ensures that the function correctly handles read-only global memory pointers, aligning with best practices in CUDA programming.
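
Why the `void const*` parameter matters once kernel parameters become `const` can be shown with a small host-side sketch. `load_sketch` and `ReadFirst` are hypothetical stand-ins, not the real `tma_load` signature; the copy is done with `std::memcpy` purely for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a load helper that only reads from its source.
inline void load_sketch(void* dst, void const* gmem_ptr, std::size_t bytes) {
  std::memcpy(dst, gmem_ptr, bytes);
}

// A caller that holds a read-only pointer, as a kernel would after its
// parameter is annotated `const`.
inline float ReadFirst(const float* src) {
  float out = 0.0f;
  // This call would not compile if gmem_ptr were declared as `void*`,
  // because `const float*` does not convert implicitly to `void*`.
  load_sketch(&out, src, sizeof(out));
  return out;
}
```

Writable pointers still convert to `void const*`, so existing callers that pass non-const pointers are unaffected.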
    
    * Remove commented-out code and reorder transformations in OptimizeForTarget function for clarity
    
    * Refactor buffer marking logic in annotate_read_only_params.cc to improve accuracy in identifying written variables. Update OptimizeForTarget function to reorder transformations for better clarity.
codegen_cuda.cc 125 KB