    feat(cutedsl): add CuTeDSL backend (#1421) · 7248a810
    Gabriel Wu authored
    
    
    * feat: CuTeDSL backend
    
    * fix: clang-tidy
    
    * fix: clang-format
    
    * fix: ci
    
    * fix: revert example gemm fp8
    
    * fix: remove duplicate code
    
    * fix: switch-case
    
    * fix: fp16 silence
    
    * fix: TVM IR print
    
    * fix: useless tir
    
    * fix: clang-format
    
    * fix: remove tilelang/contrib/cutedsl/.gitignore
    
    * fix: use hexfloat
    
    * fix: gsym guard
    
    * fix: unknown storage sync type
    
    * fix: string literal
    
    * fix: add args guard
    
    * fix: name hint dedup
    
    * fix: better find_kernel_by_pattern
    
    * fix: set libpath for from_database path
    
    * fix: guard buffer.strides
    
    * fix: from guard
    
    * fix: eviction guard
    
    * fix: use thread local tma descs
    
    * fix: ruff
    
    * fix: drop tma_init_cpp
    
    * fix: exc_info
    
    * fix: negative unmatch early return
    
    * fix: rename postproc func and add test
    
    * fix: handle fast math according to pass config
    
    * fix: dyn_sym parse
    
    * fix: wrap_forward
    
    * fix: use tvm_ffi.libinfo instead of cli
    
    * fix: keep signature
    
    * fix: C++ string safety
    
    * fix: mark tma_store_add as unsupported
    
    * fix: tvm version
    
    * resolve ldsm and cpasync issues.
    
    * fix: minor fixes
    
    * fix: parse signature using ast
    
    * fix: guard global_addr
    
    * fix: create tempfile only when necessary
    
    * fix: use logger.exception for exceptions
    
    * fix: guard lib_path and host_func
    
    * fix: remove tma_cpp_init and add timeout for cpp compile
    
    * add timeout for mbarrier_wait.
    
    * fix: _load_kernel_from_disk signature
    
    * resolve codegen issues.
    
    * fix: logger.exception
    
    * add comment for div_by=1
    
    * merge
    
    * fix: reserve cutlass,cute,tl
    
    * fix: guard tma_store
    
    * fix: allow int64 offset in make_tensor_at_offset
    
    * fix: guard barrier
    
    * fix: add comments for div_by=16
    
    * fix: div_by=1 issue
    
    * delete div_by when offset is 0
    
    * use tl.make_tensor when offset is 0
    
    * fix: explicitly check cutedsl target
    
    * fix: use param.torch_dtype()
    
    ---------
    Co-authored-by: yuxic <yuxic@nvidia.com>
    Co-authored-by: Yong <yong@local>
    Co-authored-by: LeiWang1999 <leiwang1999@outlook.com>
rt_mod_cutedsl.cc 2.32 KB