    feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248) · 53ec0b79
    OlivierDehaene authored
    * feat(fp8): add support for fbgemm
    
    * allow loading fp8 weights directly
    
    * update outlines
    
    * fix makefile
    
    * build fbgemm
    
    * avoid circular import and fix dockerfile
    
    * add default dtype
    
    * refactored weights loader
    
    * fix auto conversion
    
    * fix quantization config parsing
    
    * force new nccl on install
    
    * missing get_weights implementation
    
    * increase timeout
flash_causal_lm.py 65.5 KB