    Support turbomind bf16 (#803) · 3295eac3
    q.yao authored
    * Add bf16 template sp
    
    * prepare merge
    
    * add enable bf
    
    * add bf16 decode attention support
    
    * fix python lint
    
    * fix yapf
    
    * fix c format
    
    * c format11
    
    * fix cast
    
    * fix on sm<80
    
    * fix linux bf162 cast
    
    * fix type cast
    
    * fix lint
    
    * support from hf pretrained
    
    * fix pybind
    
    * fix converter
    
    * add trust remote code
    
    * fix comment
    
    * fix convert qwen
    
    * fix lint
    
    * fix baichuan
    
    * update weight map
converter.py 11.8 KB