    [Enhancement] Introduce `T.__ldg` (#1414) · 6f67da84
    Lei Wang authored
    * [Enhancement] Add __ldg intrinsic for CUDA read-only cache loads
    
    * Introduced the __ldg intrinsic to enable explicit read-only cached loads from global memory in CUDA.
    * Updated the corresponding documentation and added support in both CUDA and HIP code generation.
    * Enhanced the Python interface for __ldg to accept BufferLoad and Buffer types, improving usability.
    
    * [Enhancement] Update formatting and linting rules in pyproject.toml; minor test adjustment
    
    * Added new formatting rules in pyproject.toml to enforce consistent code style, including hanging indents and argument splitting.
    * Updated test_tilelang_language_intrinsics_codegen.py to improve readability by adding a blank line before the main execution block.
    * Reworded error messages in builtin.py for clarity and consistency, and tidied the formatting of function definitions and of the statements that raise `ValueError`.
    
    * lint fix
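
For readers unfamiliar with the intrinsic named in the first item, here is a minimal sketch of what a read-only cached load looks like in hand-written CUDA. It is not the code this commit generates: how `T.__ldg` lowers to CUDA or HIP is not shown here, and the kernel name `saxpy_ldg`, the problem size, and the launch configuration are all hypothetical. The only library facts relied on are that CUDA's `__ldg(const T* address)` returns the value at a global-memory address via the read-only data cache, and that it requires compute capability 3.5 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the commit): y[i] = a * x[i] + y[i].
// x is never written while the kernel runs, so it is safe to read it
// through the read-only data cache with __ldg.
__global__ void saxpy_ldg(int n, float a,
                          const float* __restrict__ x,
                          float* __restrict__ y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // __ldg takes a pointer into global memory and returns the loaded value.
        float xi = __ldg(&x[i]);
        y[i] = a * xi + y[i];
    }
}

int main() {
    const int n = 1 << 10;  // assumed problem size, for illustration only
    float* x = nullptr;
    float* y = nullptr;

    // Managed memory keeps the sketch short; error checks are kept minimal.
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess ||
        cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }
    for (int i = 0; i < n; ++i) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    saxpy_ldg<<<blocks, threads>>>(n, 3.0f, x, y);

    if (cudaDeviceSynchronize() != cudaSuccess) {
        fprintf(stderr, "kernel failed\n");
        return 1;
    }

    // With the inputs above, each element is 3 * 1 + 2.
    printf("y[0] = %f\n", y[0]);

    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

The load is only valid when the data is read-only for the lifetime of the kernel. Marking the pointer `const ... __restrict__` often lets the compiler pick the same load path on its own; the explicit intrinsic makes that choice visible in the source.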
builtin.cc 14.3 KB