fix moe sorting readme and error commit file

a61ccfe8 · dummycoderfe · 9c97391c · a61ccfe8 · a61ccfe8
Commit a61ccfe8 authored Nov 01, 2024 by dummycoderfe

Showing with 4 additions and 5 deletions

cmake/EnableCompilerWarnings.cmake +1 -1

example/ck_tile/13_moe_sorting/README.md +3 -4

--- a/cmake/EnableCompilerWarnings.cmake
+++ b/cmake/EnableCompilerWarnings.cmake
@@ -66,7 +66,7 @@ else()
            -Wunreachable-code
            -Wunused
            -Wno-reserved-identifier
-            -Werror
+	    -Werror
            -Wno-option-ignored
            -Wsign-compare
            -Wno-extra-semi-stmt

--- a/example/ck_tile/13_moe_sorting/README.md
+++ b/example/ck_tile/13_moe_sorting/README.md
-# topk-softmax
+# moe-sorting
-This folder contains example for topk-softmax kernel using ck_tile tile-programming implementation. This kernel is often used in Moe model, before launching the fused-moe-gemm block. The input is a `token*expert` 2d matrix. The op will do a softmax per row(`expert`), then find the `topk` value for each row. Output is a `token*topk`  weight(usually fp32) and index(int32) 2d tensor.
+This folder contains example for moe-sorting kernel using ck_tile tile-programming implementation. This kernel is often used in Moe model, before launching the fused-moe-gemm block. The input&weight is a `token*topk` 2d matrix. The op rearange the input weight ids into different experts and feed into fuse moe gemm kernel.
 ## build
 ```
@@ -15,13 +15,12 @@ This will result in an executable `build/bin/tile_example_moe_sorting`
 ```
 args:
          -v    weather do CPU validation or not (default:1)
-       -pr_i    input data type. fp16/fp32 (representing 8/16/32 bit data) (default:fp16)
+       -pr_i    index data type. (currently only fp32 supported now) (default:int32)
       -pr_w    output weight data type(currently only fp32 supported now) (default:fp32)
          -t    number of input tokens (default:32)
          -e    number of experts (default:8)
          -k    topk (default:2)
       -st_i    row stride of input, -1 means same as experts (default:-1)
-       -st_o    row stride of output/indices, -1 means same as topk (default:-1)
       -seed    seed to be used, -1 means random every time (default:-1)
      -kname    when set to 1 it will print kernel name (default:0)
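
Note on the updated README text: the new description says the op rearranges the `token*topk` ids and weights by expert before the fused MoE GEMM. A minimal CPU sketch of that grouping is given here for illustration only. It is not the ck_tile kernel, and the function name `moe_sort_reference`, its arguments, and the bucket-per-expert output layout are all assumptions made for this example; the real kernel's output format may differ.

```python
def moe_sort_reference(topk_ids, topk_weights, num_experts):
    """Hypothetical reference: group (token, weight) pairs by expert id.

    topk_ids[t][k]     -> expert chosen for token t in top-k slot k
    topk_weights[t][k] -> routing weight for that choice
    Returns one list per expert, with tokens kept in ascending order.
    """
    buckets = [[] for _ in range(num_experts)]
    for token, (ids, weights) in enumerate(zip(topk_ids, topk_weights)):
        for expert, weight in zip(ids, weights):
            buckets[expert].append((token, weight))
    return buckets


# 3 tokens, topk = 2, 3 experts (toy values, not taken from the example binary)
topk_ids = [[0, 2], [1, 2], [0, 1]]
topk_weights = [[0.7, 0.3], [0.6, 0.4], [0.5, 0.5]]
buckets = moe_sort_reference(topk_ids, topk_weights, num_experts=3)
for expert, entries in enumerate(buckets):
    print(expert, entries)
```

Each expert ends up with a contiguous list of the tokens routed to it, which is the shape a per-expert GEMM would want to consume.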