1. 01 Jul, 2023 1 commit
    • Add lint action (#32) · fe46dac2
      AllentDan authored
      * temp
      
      * fix lint
      
      * csrc->src
      
      * remove clang-format
      
      * skip .rst
      
      * skip doc
      
      * clang-format
      
      version
      
      version
      
      * mat_B
  2. 28 Jun, 2023 2 commits
    • feat(src): add kv cache int8 quantization (#22) · cc93136e
      tpoisonooo authored
      * feat(src): add int8 and compile passed
      
      * feat(kernels): fix
      
      * feat(llama): update kernel
      
      * feat(src): add debug
      
      * fix(kernel): k_cache use int8_t pointer
      
      * style(llama): clean code
      
      * feat(deploy.py): revert to enable fmha
      
      * style(LlamaV2): clean code
      
      * feat(deploy.py): add default quant policy
    • fix-gemm-tuning (#24) · 4d42a781
      Li Zhang authored
  3. 26 Jun, 2023 1 commit
  4. 24 Jun, 2023 1 commit
  5. 22 Jun, 2023 1 commit
  6. 21 Jun, 2023 1 commit
    • support fmha (#9) · 6c7d9992
      q.yao authored
      * support fmha
      
      * update sm by cudaarch
      
      * update ldscript path
      
      * clang-format
      
      * clang-format
  7. 20 Jun, 2023 1 commit