• Tensor Parallel python api (#82) · 7cbfe2ea
    q.yao authored
    * wip
    
    * profile disable tp
    
    * fix profile
    
    * lint
    
    * fix dlpack
    
    * remove comment
    
    * add tp flag
    
    * add session len check
    
    * add eos
    
    * remove tp and session len inputs
    
    * wrap tokenizer
    
    * multithread load weight
    
    * update profile
    
    * refactor tokenizer
    
    * remove pre/post process
    
    * remove mpi4py requirement
    
    * remove
    
    * remove bind
    
    * remove mpi requirement
    
    * check backend_tokenizer
test_tokenizer.py 334 Bytes