    [CK-Tile] Universal gemm memory bound pipeline (#1558) · 24d996aa
    Adam Osewski authored
    * CK-Tile GEMM with memory bound pipeline.
    
    * Memory bound gemm pipeline.
    
    * Fix not closed namespace.
    
    * Block gemm mem pipeline draft.
    
    * Do not use ck_tile:: within ck_tile namespace.
    
    * Refactoring & Move Layout info to pipeline problem.
    
    * Get hot loop and TailNum information before launching kernel.
    
    * Fixes in pipeline.
    
    * Add comment to load_tile_raw and change variable naming style.
    
    * Few small changes & formatting.
    
    * Do not use macro.
    
    * Add gtests.
    
    * Use AccDataType for Output of MFMA instruction.
    
    * Formatting.
    
    * Refactor gemm examples.
    
    * Switch over to current block gemm.
    
    * Use currently available pipeline policy.
    
    * Refactoring and review comments.
    
    * Fixes after merge.
    
    * Add missing include.
    
    * Add load tile overload which accepts output tensor as parameter.
    
    * This gives an 8% perf boost at the cost of using more registers.
    
    * Rename example.
    
    * Small changes.
    
    * Fix compilation error and lower K.
    
    * Support different layouts for A/B.
    
    * Fix vector size for different layouts.
    
    * Rename Alignment into VectorSize.
    
    * Unblock tests.
gemm.hpp 2.69 KB