    Add example of Gemm + AddAddFastGelu (data type: int4) (#369) · 2327f1a6
    Po Yen Chen authored
    * Add custom target to bundle examples together
    
    * Add int4 example conditionally (just copy from int8 example)
    
    * Extract common code into common.hpp
    
    * Move ref gemm type alias into data-type-specific sources
    
    * Add #error directive to prevent compiling with the wrong setting
    
    * Let AddAddFastGelu support int4 parameter type
    
    * Let check_err() support int4 parameter type
    
    * Add wrapper function to hide value conversion while copying memory
    
    * Finish int4 example for GEMM + AddAddFastGelu
    
    * Add new DeviceMem API to copy memory
    
    * Use new DeviceMem API to implement examples
    
    * Fix wrong use of macro 'CK_EXPERIMENTAL_BIT_INT_EXTENSION_INT4'
    
    * Revert "Add new DeviceMem API to copy memory"
    
    This reverts commit e26e7af71e1f982a4ca7406401e2fc9b1f086b32.
    
    * Add conversion ctor for Tensor<>
    
    * Add 'const' specifier to Tensor<>::CopyAsType()
    
    * Convert Tensor<> values before/after transfer between host & device
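
The last three items (conversion ctor, const `CopyAsType()`, converting values around host/device transfers) can be illustrated with a minimal sketch. It assumes a hypothetical `SimpleTensor<T>` stand-in and is not the library's actual `Tensor<>`. Only the name `CopyAsType` comes from the commit message. The member layout, constructor signatures, and the use of `std::int32_t`/`std::int8_t` in place of an int4 type are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Tensor<>; flat storage only, no shape or strides.
template <typename T>
struct SimpleTensor
{
    std::vector<T> data;

    SimpleTensor() = default;
    explicit SimpleTensor(std::size_t n) : data(n) {}

    // Converting constructor: element-wise static_cast from another element type.
    template <typename U>
    explicit SimpleTensor(const SimpleTensor<U>& other) : data(other.data.size())
    {
        for(std::size_t i = 0; i < other.data.size(); ++i)
        {
            data[i] = static_cast<T>(other.data[i]);
        }
    }

    // const-qualified so it can be called on a read-only tensor.
    template <typename U>
    SimpleTensor<U> CopyAsType() const
    {
        return SimpleTensor<U>(*this);
    }
};
```

Under these assumptions, a caller would convert to the transfer type before copying to the device (`host.CopyAsType<std::int8_t>()`) and convert back with the converting constructor after copying the result to the host.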
common.hpp 3.13 KB