    feat: pythonized model and QwenImage Support (#593) · f86ad470
    Muyang Li authored
    * start refactoring the codebase
    
    * update
    
    * update
    
    * start to implement ops
    
    * add gemm
    
    * write the docstrings
    
    * define the w4a4 svdq linear
    
    * update
    
    * make the linter happy
    
    * finished the SVDQW4A4Linear
    
    * finished the SVDQW4A4Linear
    
    * update
    
    * update
    
    * add a patcher to the model
    
    * update
    
    * add adanormsinglezero
    
    * update
    
    * update
    
    * finished the naive implementation of nunchaku flux
    
    * add ff
    
    * finished the naive forward
    
    * update
    
    * svdq linear
    
    * start debugging
    
    * fix some issues
    
    * successfully built the model
    
    * update
    
    * successfully load the model
    
    * update
    
    * update
    
    * update
    
    * try to make it runnable
    
    * debugging
    
    * debugging
    
    * debugging
    
    * add bias to awq linear
    
    * run through
    
    * fix the normalization
    
    * update
    
    * update
    
    * update
    
    * fix the attention
    
    * fix the no fuse nvfp models
    
    * update
    
    * finished the fused ff
    
    * make linter happy
    
    * make linter happy
    
    * make linter happy
    
    * debugging the fp16 attn
    
    * nunchaku fp16 is buggy
    
    * finish the fp16 attn
    
    * fp4 done
    
    * fix the lora scales
    
    * add a default value for alpha; need to debug int4
    
    * fix int4
    
    * update
    
    * update
    
    * ff does not work
    
    * specialize the processors
    
    * qwen transformer done. start debugging
    
    * make linter happy
    
    * add schnell v2 for metrics eval
    
    * chore: schnellv2 eval
    
    * update
    
    * ff and attention correct
    
    * need to check what happened to module
    
    * fp4 done
    
    * make linter happy
    
    * update an example script
    
    * reformat
    
    * add an example script
    
    * add the announcement
    
    * remove a misleading info
    
    * ready to release
README.md 14.1 KB