  1. 18 Jul, 2023 1 commit
    • Tensor Parallel python api (#82) · 7cbfe2ea
      q.yao authored
      * wip
      
      * profile disable tp
      
      * fix profile
      
      * lint
      
      * fix dlpack
      
      * remove comment
      
      * add tp flag
      
      * add session len check
      
      * add eos
      
      * remove tp and session len inputs
      
      * wrap tokenizer
      
      * multithread load weight
      
      * update profile
      
      * refactor tokenizer
      
      * remove pre/post process
      
      * remove mpi4py requirement
      
      * remove
      
      * remove bind
      
      * remove mpi requirement
      
      * check backend_tokenizer
  2. 03 Jul, 2023 1 commit
  3. 01 Jul, 2023 2 commits
  4. 21 Jun, 2023 1 commit
    • support fmha (#9) · 6c7d9992
      q.yao authored
      * support fmha
      
      * update sm by cudaarch
      
      * update ldscript path
      
      * clang-format
      
      * clang-format
      
  5. 20 Jun, 2023 1 commit